Learning SQL and solving Murder Mystery

Introduction

Hello there,

My name is Namrata Joshi, coming from a medical background who practiced physical therapy for almost 12 years. I recently started learning SQL as I am advancing towards a career change and have decided to blog my journey. I found this amazing online source that helped me to not only practice SQL but also learn in an interesting and fun way. I hope this blog comes in handy to anyone who is trying to solve the SQL City Murder Mystery.

Step 1 : Finding crime scene report

I retrieved the crime scene report from the police department’s database with the information which is provided( date= 15th Jan 2018 and city= SQL City, type = Murder). I was able to get the date format by running a query( select * from crime_scene_report ) which helped me to use that in the query below to get a description in the crime_scene_report with the matching criteria.

Query

SELECT * 
from crime_scene_report 
where date="20180115" 
and city = "SQL City";

Output

image.png

Step 2 : Finding First Witness

From the above description, there is a clue that there are 2 witnesses. I used the first witness information- “Lives at the last house on Northwestern Dr” to run a query in Person table. This helped me find the name and address_number columns with matching criteria of address_street_name= “Northwestern Dr”. I used ORDER BY to arrange the address_number column data in a descending order( sorting the address_number from last to first) and limited my output result to 3 to get the name of the first result output being the first witness.

Query

Select name, 
address_number
from person
where address_street_name = "Northwestern Dr"
Order by address_number desc
Limit 3;

Output

image.png

1st witness- Morty Schapiro

Step 3 : Finding Second Witness

There is another clue in the retrieved crime_scene_report description about the second witness whose name is “Annabel” and lives somewhere on “Franklin Ave”. I used the Person table to find the name and address_number of a person. I used a WHERE Clause inputting the provided information of address_street_name = “Franklin Ave” and used a wildcard %’ later to get a more precise search. For example, using LIKE and % after Annabel to retrieve all the data after the word Annabel. This helped me get the full name of the 2nd witness.

Query

Select name, 
address_number
from person
where address_street_name = "Franklin Ave"
and name like 'Annabel%';

Output

image.png

2nd Witness- Annabel Miller

Step 4: Finding the witnesses interview transcripts

After getting the full names of the first and 2nd witnesses, it was important to find their interview transcripts individually from the interview table. Hence, I queried both the Interview and Person tables as the transcript was needed from the interview table and the name had to be used in the person table. I used a WHERE clause to join two tables using the primary keyid from the person table and foreign key person_id.) and used an AND clause to input the name of each witness in individual queries to get their interview transcripts.

1) Morty Schapiro

Query

  select person.name, 
  interview.transcript
  from person, interview
  where person.id = interview.person_id
  and person.name = 'Morty Schapiro';

Output

image.png

2) Annabel Miller

Query

select person.name, 
interview.transcript
from person, interview
where person.id = interview.person_id
and person.name = 'Annabel Miller';

Output

image.png

Step 5: Finding Gym Members

I combined the interview transcript information of both the witnesses as they were related. Both said that the murderer was recognised as the "Get Fit Now Gym member". Hence I had to join two tables and use a get_fit_now_member table to get the name, id and membership_status information. I used the provided information of check_in_date from get_fit_now_check_in table and membership_id starting with “48Z” from get_fit_now_member table. I added the WHERE clause to join the two tables using the primary key id from get_fit_now_member table and the foreign key of membership_id from get_fit_now_check_in table. I further applied the AND’clause and a wildcard % to input the exact information provided by the witnesses. This helped me get a precise output data on names of gold membership status members with their membership ids starting with “48Z”.

Query

select get_fit_now_member.name, 
get_fit_now_member.id,
get_fit_now_member.membership_status
from get_fit_now_member, get_fit_now_check_in
where get_fit_now_member.id  = get_fit_now_check_in.membership_id
and (get_fit_now_check_in.check_in_date = '20180109' 
and get_fit_now_member.id like '48Z%');

Output

image.png

Step 6: Querying the drivers_license table

Using the license plate information provided by "Morty Schapiro", I wanted to find the name of the person having a car license plate number containing “H42W”. For this, I used the two tables of person and drivers_license. I joined two tables using a WHERE clause matching the primary key of id from drivers_license table and foreign key of license_id from person table to retrieve the common data present in both tables. I used an AND clause and a wildcard % after “H42W” to get the list of names and their respective drivers license plate numbers containing the input data.

Query

Select person.name,
drivers_license.plate_number
from person, drivers_license
Where person.license_id= drivers_license.id
and drivers_license.plate_number like '%H42W%';

Output

image.png

Step 7: We found the murderer

The murderer is “Jeremy Bowers” after merging the outputs of step 5 and step 6.

Step 8: Verification

a) Checking "Jeremy Bowers" interview transcript.

I was able to find the interview transcript of Jeremy Bowers using both person and interview tables. I used those two tables asid and name columns were present in person tables and transcript column was present in the interview table. I joined those tables using the primary key of id in the person table and foreign key of person_id in the interview table with a WHERE clause. I used an AND clause to specify name specifically being ''Jeremy Bowers''.

Query

Select person.id, 
person.name,
interview.transcript
from person,interview
where person.id = interview.person_id
and person.name= 'Jeremy Bowers';

Output

image.png

b) Finding the brain behind the murder

Option 1

I had to find the name of the person and the person id matching the transcript description provided by Jeremy Bowers. I used these 4 tables of person, drivers_license, income and facebook_event_checkin in the query below. I joined the tables using their keys with a WHERE and multiple AND clauses. I input the information of hair color being red, car make being Tesla and gender being female to retrieve specific data matching those information.

Query

select person.id,
person.name
from person,drivers_license, income, facebook_event_checkin
where person.license_id = drivers_license.id
and 
(person.id  = facebook_event_checkin.person_id 
and person.ssn = income.ssn and drivers_license.hair_color= 'red'
and drivers_license.car_make= 'Tesla' 
and drivers_license.gender = 'female');

Output

image.png

Option 2

I used a subquery first to find the person's id, person’s name, height, car model, income from 3 tables of person, drivers_license and income. I also included a WHERE and multiple AND clauses to join the three tables matching their keys and input the hair color, car make and gender information as derived from Jeremy’s interview transcript. I later used an ORDER BY to sort the annual_income column data in the descending order. I later ran a query to find nameand id of the person from the above subquery as well as from the facebook_event_checkin table. It was important to join the personand facebook_event_checkin tables using keys of id and person_id respectively to retrieve the matching data.

Query

select id, name 
from 
       (select person.id,
        person.name,
        drivers_license.height,
        drivers_license.car_model,
        income.annual_income
        from person,drivers_license, income
        where person.license_id = drivers_license.id
        and (person.ssn = income.ssn 
              and drivers_license.hair_color= 'red'
              and drivers_license.car_make= 'Tesla' 
              and drivers_license.gender = 'female')
        Order by income.annual_income desc)
,facebook_event_checkin 
where id = facebook_event_checkin.person_id;

Output

image.png

Mystery Solved

Brain Behind the murder

Miranda Priestly

Lessons Learnt

  1. It was helpful to first query all the different tables to get a better read of the inside data.
  2. I learnt to think creatively to join different tables depending upon the information I was getting from each query.
  3. It was fun to play around with a few wildcards and filters to retrieve the exact data although it took some trials and errors to get rid of the few syntax errors.
  4. I could also have used a combination of different JOINS in my queries although it was easier to filter the data using WHERE and AND clauses.
  5. Readability was an issue in some of my subqueries until I started to add some indentations.
  6. It is always helpful to think about multiple ways to retrieve the required data.

Resources

SQL City Murder Mystery