# Learning SQL and solving Murder Mystery




# Introduction

Hello there,

My name is Namrata Joshi, coming from a medical background who practiced physical therapy for almost 12 years. I recently started learning SQL as I am advancing towards a career change and have decided to blog my journey. I found this amazing online source that helped me to not only practice SQL but also learn in an interesting and fun way. I hope this blog comes in handy to anyone who is trying to solve the [SQL City Murder Mystery.](https://mystery.knightlab.com/)


## Step 1 : Finding crime scene report

I retrieved the crime scene report from the police department’s database with the information which is provided( date= 15th Jan 2018 and city= SQL City, type = Murder). I was able to get the date format by running a query( `select * from crime_scene_report` ) which helped me to use that in the query below to get a description in the `crime_scene_report` with the matching criteria.

### Query

```
SELECT * 
from crime_scene_report 
where date="20180115" 
and city = "SQL City";
```

### Output

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1669066897238/1Oj5OetJr.png align="left")


## Step 2 : Finding First Witness

From the above description, there is a clue that there are 2 witnesses. I used the first witness information- “Lives at the last house on Northwestern Dr” to run a query in `Person` table. This helped me find the name and `address_number` columns with matching criteria of `address_street_name= “Northwestern Dr”`. I used `ORDER BY` to arrange the `address_number` column data in a descending order( sorting the address_number from last to first) and limited my output result to 3 to get the name of the first result output being the first witness.

### Query

```
Select name, 
address_number
from person
where address_street_name = "Northwestern Dr"
Order by address_number desc
Limit 3;
```

### Output

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1669076125385/sryzGB5Xy.png align="left")

1st witness- Morty Schapiro


## Step 3 : Finding Second Witness

There is another clue in the retrieved `crime_scene_report` description about the second witness whose name is “Annabel” and lives somewhere on “Franklin Ave”. I used the `Person` table to find the `name` and `address_number` of a person. I used a `WHERE` Clause inputting the provided information of `address_street_name = “Franklin Ave”` and used a wildcard  `%’` later to get a more precise search. For example, using `LIKE` and `%` after Annabel to retrieve all the data after the word Annabel. This helped me get the full name of the 2nd witness. 

### Query

```
Select name, 
address_number
from person
where address_street_name = "Franklin Ave"
and name like 'Annabel%';
```

### Output

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1669076323982/MHJ50iiVB.png align="left")

2nd Witness- Annabel Miller


## Step 4: Finding the witnesses interview transcripts

After getting the full names of the first and 2nd witnesses, it was important to find their interview transcripts individually from the `interview` table. Hence, I queried both the `Interview` and `Person` tables as the transcript was needed from the interview table and the name had to be used in the person table. I used a `WHERE` clause to join two tables using the primary key`id` from the person table and foreign key `person_id`.) and used an `AND` clause to input the name of each witness in individual queries to get their interview transcripts.

### 1) Morty Schapiro

#### Query

  ```          
  select person.name, 
  interview.transcript
  from person, interview
  where person.id = interview.person_id
  and person.name = 'Morty Schapiro';
```

#### Output

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1669076699196/iXRHFICXj.png align="left")

### 2) Annabel Miller

#### Query

```           
select person.name, 
interview.transcript
from person, interview
where person.id = interview.person_id
and person.name = 'Annabel Miller';
```

#### Output

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1669076797257/z7NU8XOKu.png align="left")


## Step 5: Finding Gym Members

I combined the interview transcript information of both the witnesses as they were related. Both said that the murderer was recognised as the "Get Fit Now Gym member". Hence I had to join two tables and use a `get_fit_now_member` table to get the `name`, `id` and `membership_status` information. I used the provided information of `check_in_date` from `get_fit_now_check_in table` and `membership_id` starting with “48Z” from `get_fit_now_member` table.  I added the `WHERE` clause to join the two tables using the primary key `id` from `get_fit_now_member` table and the foreign key of `membership_id` from `get_fit_now_check_in` table. I further applied the `AND’`clause and a wildcard `%` to input the exact information provided by the witnesses. This helped me get a precise output data on names of gold membership status members with their membership ids starting with “48Z”.

### Query

```
select get_fit_now_member.name, 
get_fit_now_member.id,
get_fit_now_member.membership_status
from get_fit_now_member, get_fit_now_check_in
where get_fit_now_member.id  = get_fit_now_check_in.membership_id
and (get_fit_now_check_in.check_in_date = '20180109' 
and get_fit_now_member.id like '48Z%');
```

### Output

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1669077992802/kd7nr7nb6.png align="left")


## Step 6: Querying the drivers_license table

Using the license plate information provided by "Morty Schapiro", I wanted to find the name of the person having a car license plate number containing “H42W”. For this, I used the two tables of
`person` and `drivers_license`. I joined two tables using a `WHERE` clause matching the primary key of `id` from `drivers_license` table and foreign key of `license_id` from `person` table to retrieve the common data present in both tables. I used an `AND` clause and a wildcard `%` after “H42W” to get the list of names and their respective drivers license plate numbers containing the input data. 

### Query

```
Select person.name,
drivers_license.plate_number
from person, drivers_license
Where person.license_id= drivers_license.id
and drivers_license.plate_number like '%H42W%';
```

### Output

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1669077226977/6VXEjfuhw.png align="left")


## Step 7: We found the murderer

The murderer is “Jeremy Bowers” after merging the outputs of step 5 and step 6. 


## Step 8: Verification

### a) Checking "Jeremy Bowers" interview transcript.

I was able to find the interview transcript of Jeremy Bowers using both `person` and `interview` tables. I used those two tables as`id` and `name` columns were present in `person` tables and `transcript` column was present in the `interview` table. I joined those tables using the primary key of `id` in the `person` table and foreign key of `person_id` in the `interview` table with a `WHERE` clause. I used an `AND` clause to specify `name` specifically being ''Jeremy Bowers''.

#### Query

```
Select person.id, 
person.name,
interview.transcript
from person,interview
where person.id = interview.person_id
and person.name= 'Jeremy Bowers';
```

#### Output

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1669077433300/DQxmEoGfX.png align="left")


### b) Finding the brain behind the murder

#### Option 1

I had to find the `name` of the person and the person `id` matching the `transcript` description provided by Jeremy Bowers. I used these 4 tables of `person`, `drivers_license`, `income` and `facebook_event_checkin` in the query below. I joined the tables using their keys with a `WHERE` and multiple `AND` clauses. I input the information of hair color being red, car make being Tesla and gender being female to retrieve specific data matching those information. 

#### Query

```
select person.id,
person.name
from person,drivers_license, income, facebook_event_checkin
where person.license_id = drivers_license.id
and 
(person.id  = facebook_event_checkin.person_id 
and person.ssn = income.ssn and drivers_license.hair_color= 'red'
and drivers_license.car_make= 'Tesla' 
and drivers_license.gender = 'female');
```

#### Output

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1669077610362/_BFM1DQ0N.png align="left")


#### Option 2

I used a subquery first to find the person's id, person’s name, height, car model, income from 3 tables of `person`, `drivers_license` and `income`. I also included a `WHERE` and multiple `AND` clauses to join the three tables matching their keys and input the hair color, car make and gender information as derived from Jeremy’s interview transcript. I later used an `ORDER BY` to sort the `annual_income` column data in the descending order. I later ran a query to find `name`and `id` of the person from the above subquery as well as from the `facebook_event_checkin` table. It was important to join the `person`and `facebook_event_checkin` tables using keys of `id` and `person_id` respectively to retrieve the matching data. 

#### Query

```
select id, name 
from 
       (select person.id,
        person.name,
        drivers_license.height,
        drivers_license.car_model,
        income.annual_income
        from person,drivers_license, income
        where person.license_id = drivers_license.id
        and (person.ssn = income.ssn 
              and drivers_license.hair_color= 'red'
              and drivers_license.car_make= 'Tesla' 
              and drivers_license.gender = 'female')
        Order by income.annual_income desc)
,facebook_event_checkin 
where id = facebook_event_checkin.person_id;
```

#### Output

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1669077711326/XaH03V6qB.png align="left")


## Mystery Solved

### Brain Behind the murder
Miranda Priestly


## Lessons Learnt

1. It was helpful to first query all the different tables to get a better read of the inside data.
1. I learnt to think creatively to join different tables depending upon the information I was getting from each query. 
1. It was fun to play around with a few wildcards and filters to retrieve the exact data although it took some trials and errors to get rid of the few syntax errors. 
1. I could also have used a combination of different JOINS in my queries although it was easier to filter the data using WHERE and AND clauses.
1. Readability was an issue in some of my subqueries until I started to add some indentations.
1. It is always helpful to think about multiple ways to retrieve the required data.

  
## Resources

[SQL City Murder Mystery](https://mystery.knightlab.com/)















              


































