<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Namrata Joshi's blog]]></title><description><![CDATA[I am a healthcare professional, a health IT and data analyst enthusiast. I am transitioning my career and blogging my journey along the way.]]></description><link>https://blogs.namratajoshi.me</link><generator>RSS for Node</generator><lastBuildDate>Tue, 14 Apr 2026 21:59:18 GMT</lastBuildDate><atom:link href="https://blogs.namratajoshi.me/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Is wait time a significant challenge within the Canadian healthcare system?]]></title><description><![CDATA[We all are aware that wait time has always been a continuous struggle and is one of the greatest challenges in the Canadian Healthcare system.
The total wait time for patients to receive their medical care has significantly increased in the past few ...]]></description><link>https://blogs.namratajoshi.me/is-wait-time-a-significant-challenge-within-the-canadian-healthcare-system</link><guid isPermaLink="true">https://blogs.namratajoshi.me/is-wait-time-a-significant-challenge-within-the-canadian-healthcare-system</guid><category><![CDATA[#healthcaresystem]]></category><category><![CDATA[waittime]]></category><dc:creator><![CDATA[Namrata Joshi]]></dc:creator><pubDate>Tue, 11 Jul 2023 16:10:28 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/BXOXnQ26B7o/upload/94451fc5e3dac40d7c003b8f49bfa67f.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We all are aware that wait time has always been a continuous struggle and is one of the greatest challenges in the Canadian Healthcare system.</p>
<p>The total wait time for patients to receive medical care has increased significantly in the past few years. In 2021, the median wait from referral by a family physician, through consultation with a specialist, to treatment was reportedly 175% longer than in 1993 <em>(Moir &amp; Barua, 2021)</em>.</p>
<h2 id="heading-introduction">Introduction</h2>
<p>While working as a physical therapist in the USA, I met many patients whose family members in Canada faced long waits to schedule elective hip and knee replacement surgeries. I always wondered what factors were driving this challenge within the healthcare system.</p>
<p>Taking a continuing education course "Understanding Canadian Healthcare System" from McMaster University Continuing Education provided me with an opportunity to learn in-depth about the structure and functionality of Canadian healthcare. I researched further on the factors contributing to the wait time and wrote an academic essay on it as a part of the assignment in November 2022.</p>
<h4 id="heading-here-is-a-link-to-my-assignment-essay">Here is a link to my assignment essay:</h4>
<p><a target="_blank" href="https://drive.google.com/file/d/1qwUqdhK9sC0b1rS6pSYrJRAUHEQgifOZ/view">Wait Time For Healthcare</a></p>
<h2 id="heading-some-of-the-key-findings">Some of the key findings</h2>
<h3 id="heading-factors-contributing-to-the-crisis">Factors contributing to the crisis</h3>
<ul>
<li><p><strong>Healthcare Funding Constraints</strong></p>
<p>  Federal healthcare funding constraints cause provincial governments to reduce healthcare resources in order to operate efficiently and cost-effectively (Lampkin, 2022). Covid-19 has been one of the factors affecting federal healthcare budgets: provinces had to delay elective surgeries to care for Covid-19 patients, which resulted in further backlogs.</p>
</li>
<li><p><strong>Healthcare Worker Crisis</strong></p>
<p>  The national healthcare worker crisis has left healthcare facilities understaffed, so people have to wait in line at hospitals, emergency departments and long-term care centres to receive the care they need (Wright, 2022).</p>
</li>
<li><p><strong>Inappropriate Hospital Bed Assignments</strong></p>
<p>  Chronic national hospital bed shortages and inappropriate bed assignments, caused by inadequate funding and a lack of system coordination and technical framework across care sectors, result in patients occupying hospital beds at an alternative level of care <em>(Sutherland &amp; Crump, 2013)</em>. This further increases wait times because those beds are unavailable for acute patients.</p>
</li>
</ul>
<h3 id="heading-potential-solutions">Potential solutions</h3>
<ul>
<li><p><strong>Retain Healthcare Workforce.</strong></p>
<p>  There is an urgent requirement to retain the healthcare workforce in urban and rural communities with incentives of better salaries and working conditions, routine health and mental wellness benefits and increasing child support at work to provide job security for healthcare workers.</p>
</li>
<li><p><strong>Increase Technological Framework</strong>.</p>
<p>  There can be an integrated technological framework with interoperable EHR systems between all healthcare sectors across urban and rural communities and provinces. This integrated system can help in efficient monitoring and coordinating care with a focus on virtual care and e-referrals to timely link patients to the appropriate health providers to reduce wait time.</p>
</li>
<li><p><strong>Distribute Home-care and Long-term Care Resources Adequately.</strong></p>
<p>  Federal funding can be increased to improve home care and long-term care resources, with a focus on distributing those resources properly between rural and urban communities. Constructing more publicly funded long-term care beds and increasing partnerships with the private sector can help meet the growing elderly population's needs and further reduce wait times.</p>
</li>
</ul>
<h2 id="heading-conclusion">Conclusion</h2>
<p>I enjoyed researching and writing an essay about the wait time and learnt a lot.</p>
<p>I believe that there are many interconnected challenges in addressing this issue despite both the federal and provincial governments taking several funding initiatives and adopting strategies.</p>
<p>Major systemic reforms and sustainable long-term change initiatives need to be implemented in both the public and private sectors to help Canadians get timely care.</p>
<p><strong>Thank you for taking the time to read this blog. I would appreciate any feedback or suggestions for improvement.</strong></p>
<h3 id="heading-resources">Resources</h3>
<p>Moir, M., &amp; Barua, B. (2021, December 15). Waiting your turn: Wait times for health care in Canada, 2021 report. Retrieved November 2, 2022, from <a target="_blank" href="https://www.fraserinstitute.org/studies/waiting-your-turn-wait-times-for-health-care-in-canada-2021">https://www.fraserinstitute.org/studies/waiting-your-turn-wait-times-for-health-care-in-canada-2021</a></p>
<p>Lampkin, C. (2022, April 08). Federal Budget 2022: Federal Funding for health care fails to meet Canadians' expectations. Retrieved November 2, 2022, from <a target="_blank" href="https://www.canadaspremiers.ca/federal-budget-2022-federal-funding-for-health-care-fails-to-meet-canadians-expectations/">https://www.canadaspremiers.ca/federal-budget-2022-federal-funding-for-health-care-fails-to-meet-canadians-expectations/</a></p>
<p>Wright, T. (2022, October 31). Canada's ER crisis: Doctors urge governments to stop finger-pointing and find solutions - national. Retrieved November 2, 2022, from <a target="_blank" href="https://globalnews.ca/news/9234725/canada-emergency-rooms-solutions/">https://globalnews.ca/news/9234725/canada-emergency-rooms-solutions/</a></p>
<p>Sutherland, J., &amp; Crump, R. (2013, August). Alternative level of care: Canada's Hospital Beds, the evidence and options. Retrieved November 2, 2022, from <a target="_blank" href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3999549/">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3999549/</a></p>
]]></content:encoded></item><item><title><![CDATA[Stroke Data Exploration Using Excel]]></title><description><![CDATA[Introduction
This is an Excel dashboard project that I completed as part of practising my Excel skills. I used the stroke dataset from Kaggle to prepare, clean and analyze the data and created a visualization dashboard using different graphs.

You can ...]]></description><link>https://blogs.namratajoshi.me/stroke-data-exploration-using-excel</link><guid isPermaLink="true">https://blogs.namratajoshi.me/stroke-data-exploration-using-excel</guid><category><![CDATA[excel]]></category><category><![CDATA[kaggle]]></category><category><![CDATA[Stroke data analysis]]></category><category><![CDATA[Data analysis project]]></category><dc:creator><![CDATA[Namrata Joshi]]></dc:creator><pubDate>Fri, 07 Jul 2023 17:20:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/KgLtFCgfC28/upload/0e1fbb6904661f99e75feeaff781d7d3.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>This is an Excel dashboard project that I completed as part of practising my Excel skills. I used the stroke dataset from Kaggle to prepare, clean and analyze the data and created a visualization dashboard using different graphs.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1688660956214/f312a9a5-ed08-486c-b87f-dbc0f468aa59.png" alt class="image--center mx-auto" /></p>
<h4 id="heading-you-can-find-the-link-to-my-completed-excel-project-and-the-stroke-dataset-here">You can find the link to my completed Excel project and the stroke dataset here:</h4>
<p><a target="_blank" href="https://1drv.ms/x/s!AoZxd2YkwCSTgxFXHc6G-3aKoRae?e=Ttdevu">Excel Stroke Data Exploration and Visualization</a></p>
<p><a target="_blank" href="https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset">Kaggle Dataset</a></p>
<h3 id="heading-data-preparation-cleaning-and-analysis-process">Data Preparation, Cleaning And Analysis Process</h3>
<ol>
<li>I downloaded the CSV file of the stroke dataset from Kaggle and imported it to Excel.</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1688746001980/d53bc75f-c61a-476f-ab33-b6cb790b3cb1.png" alt class="image--center mx-auto" /></p>
<ol start="2">
<li><p>I duplicated the data into another sheet to keep the original intact, and used the new sheet for further cleaning and analysis. I named it the "Working Sheet".</p>
</li>
<li><p>I used <code>Sort</code> and <code>Filter</code> features in Excel to get the first look at the data.</p>
</li>
<li><p>I started cleaning the data by first checking for duplicate rows but none were found.</p>
</li>
<li><p>I increased the visibility of the column labels by using the formatting tools to make the labels bold and change the case of the labels.</p>
</li>
<li><p>After applying a <code>Filter</code>, I found an inconsistency in the <code>Work Type</code> column: some rows had the string value "Children". I checked the age column and found that these individuals were between 0 and 16 years of age.</p>
</li>
<li><p>I later replaced "Children" in the <code>Work Type</code> column with "Unemployed" as a category for clearer visualizations.</p>
</li>
<li><p>I used the <code>find</code> and <code>replace</code> features in Excel to change the values of 1 and 0 to "Yes" and "No" respectively in the columns of Stroke, Hypertension and Heart Disease for clarity in the visualizations.</p>
</li>
<li><p>I created bracket columns for <code>Age</code> and <code>BMI</code>, named "Age Brackets" and "BMI Brackets", using the following nested <code>IF</code> functions. This helped me create clear visualizations of those categories.</p>
</li>
</ol>
<p><code>=IF(C2&gt;=65,"Senior (65+)",IF(C2&gt;=25,"Adult (25-64)",IF(C2&gt;=15,"Youth (15-24)",IF(C2&gt;=0,"Child (0-14)","Invalid"))))</code></p>
<p><code>=IF(K2="Unknown","Unknown",IF(K2&gt;=30,"Obese (30+)",IF(K2&gt;=25,"Overweight (25-29.9)",IF(K2&gt;=18.5,"HealthyWeight (18.5-24.9)",IF(K2&lt;18.5,"Underweight (0 -18.4)","Unknown")))))</code></p>
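<p>For readers who find nested <code>IF</code>s hard to follow, the same bracket logic can be sketched in Python. This is only an illustration and not part of the Excel workbook: the function names <code>age_bracket</code> and <code>bmi_bracket</code> are hypothetical, and the sketch assumes the bracket boundaries are the ones named in the labels (0-14, 15-24, 25-64, 65+ for age; 18.5, 25 and 30 as the BMI cut-offs).</p>
<pre><code class="lang-python">from bisect import bisect_right

# Lower bounds of each bracket, taken from the labels in the formulas above.
AGE_BOUNDS = [15, 25, 65]
AGE_LABELS = ["Child (0-14)", "Youth (15-24)", "Adult (25-64)", "Senior (65+)"]

BMI_BOUNDS = [18.5, 25, 30]
BMI_LABELS = [
    "Underweight (0-18.4)",
    "Healthy weight (18.5-24.9)",
    "Overweight (25-29.9)",
    "Obese (30+)",
]


def age_bracket(age):
    """Return the age bracket label for a non-negative age in years."""
    # bisect_right counts how many lower bounds are at or below the age.
    return AGE_LABELS[bisect_right(AGE_BOUNDS, age)]


def bmi_bracket(bmi):
    """Return the BMI bracket label, or "Unknown" for non-numeric values."""
    if not isinstance(bmi, (int, float)):
        return "Unknown"
    return BMI_LABELS[bisect_right(BMI_BOUNDS, bmi)]


print(age_bracket(24), "|", bmi_bracket(27.3), "|", bmi_bracket("N/A"))
</code></pre>
<p>Keeping the cut-offs in one list makes it harder for a boundary to drift out of step with its label, which is easy to miss inside a long nested formula.</p>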
<ol start="10">
<li><p>I found and replaced "N/A" values in the <code>BMI</code> column with "Unknown" for clarity in the visualizations.</p>
</li>
<li><p>I increased the consistency of the strings throughout the columns of <code>Work Type</code> by finding and replacing "Never_worked" and "Self-employed" with "Unemployed" and "Self_employed" respectively.</p>
</li>
</ol>
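<p>The find-and-replace steps above can also be written down as lookup tables, which makes the cleaning rules easy to review in one place. The following is a minimal Python sketch and not part of the Excel workbook: <code>clean_row</code> is a hypothetical helper, the lower-case column names are assumed, the values are spelled as they appear in this post, and the sample record is made up.</p>
<pre><code class="lang-python"># Replacement tables mirroring the find-and-replace steps described above.
FLAG_LABELS = {0: "No", 1: "Yes"}
WORK_TYPE_LABELS = {
    "Children": "Unemployed",
    "Never_worked": "Unemployed",
    "Self-employed": "Self_employed",
}


def clean_row(row):
    """Return a cleaned copy of one record; the input dict is left untouched."""
    cleaned = dict(row)
    for column in ("stroke", "hypertension", "heart_disease"):
        cleaned[column] = FLAG_LABELS[row[column]]
    # Values without a replacement (for example "Private") pass through unchanged.
    cleaned["work_type"] = WORK_TYPE_LABELS.get(row["work_type"], row["work_type"])
    if row["bmi"] == "N/A":
        cleaned["bmi"] = "Unknown"
    return cleaned


# Made-up record used only to demonstrate the helper.
sample = {"stroke": 1, "hypertension": 0, "heart_disease": 0,
          "work_type": "Never_worked", "bmi": "N/A"}
print(clean_row(sample))
</code></pre>
<p>Because the helper returns a copy, the original record stays intact, much like keeping the raw data on its own sheet.</p>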
<h3 id="heading-key-insights-from-the-analysis">Key insights from the analysis</h3>
<ol>
<li><p>A total of 249 patients had a stroke, with females having more stroke incidents than males:<br /> 56.63% in females vs 43.37% in males.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1688747791322/af7f7ce6-ae8b-4c0d-8443-be6918a49c22.png" alt class="image--center mx-auto" /></p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1688749303101/6f8daa05-092c-45a2-a70c-b820b30a7720.png" alt class="image--center mx-auto" /></p>
</li>
<li><p><strong>Seniors</strong> had the highest percentage of strokes followed by <strong>adults</strong> and then <strong>children.</strong></p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1688749660574/dec46a60-c0ed-475f-9557-55922d2a4ffe.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>People <strong>working privately</strong> had <strong>more stroke</strong> incidents than others. <strong>Unemployed male</strong> individuals had <strong>no stroke incidents</strong> while <strong>unemployed females</strong> had <strong>2 stroke incidents</strong>.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1688749785492/eadca14b-31d6-46b0-b187-e85db8ff1cf0.png" alt class="image--center mx-auto" /></p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1688749821915/c87b9f8d-81dc-45cc-814e-a043855f392d.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Individuals who <strong>did not smoke</strong> had the <strong>highest stroke incidents</strong> followed by people who formerly smoked and then the ones who smoked.</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1688748714305/777fde76-41f3-43ba-970a-abdd4cc18e0a.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1688748773176/4c567b02-85fe-4a06-804e-6b89b6608062.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-limitations-in-the-dataset">Limitations in the dataset</h3>
<ol>
<li><p>There could be a sampling bias in the data with a significantly higher number of people who did not have a stroke than the ones who had a stroke.</p>
</li>
<li><p>The time frame for which the data was collected is missing which makes it difficult to draw enough patterns in the data.</p>
</li>
<li><p>There are a significant number of nulls or missing values which I categorized as unknown.</p>
</li>
</ol>
<h3 id="heading-conclusion">Conclusion</h3>
<h4 id="heading-skills-used"><strong>Skills Used</strong></h4>
<ol>
<li><p>Data cleaning in Excel.</p>
</li>
<li><p>Creating categories/ brackets using nested IF functions to use them for visualizations.</p>
</li>
<li><p>Creating pivot tables and charts.</p>
</li>
<li><p>Creating a visualization dashboard in Excel using customized charts.</p>
</li>
<li><p>Inserting slicers to create an interactive dashboard.</p>
</li>
</ol>
<h4 id="heading-my-learnings"><strong>My learnings</strong></h4>
<p>I enjoyed using Excel for this project as I found it a very useful tool to:</p>
<ol>
<li><p>Explore the data with some built-in functionalities of sorting and filtering to get the first view of the data, identify the data format and check for errors and duplicates in the data to clean it further.</p>
</li>
<li><p>Analyze the data using pivot tables to get useful insights from the data.</p>
</li>
<li><p>Create simple and effective visualizations with customized charts.</p>
</li>
<li><p>Make the visualizations interactive by using some of its functionalities like inserting multiple slicers.</p>
</li>
</ol>
<p><strong>I appreciate your time to review my project blog and look forward to any feedback or suggestions for improvement.</strong></p>
<h4 id="heading-resources"><strong>Resources:</strong></h4>
<p><a target="_blank" href="https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset">Kaggle Stroke Dataset Prediction</a></p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=qfyynHBFOsM&amp;t=1s&amp;ab_channel=AlexTheAnalyst">Alex The Analyst YouTube Portfolio Tutorial</a></p>
]]></content:encoded></item><item><title><![CDATA[Analysis Of McDonald's Menu Using Google Sheet]]></title><description><![CDATA[Introduction
Hello all,
I am Namrata Joshi, a Physical Therapist and a NASM Certified Nutrition Coach. I am advancing towards a career change and learning Excel as a part of my journey. I wanted to combine my nutrition knowledge and Excel learnings a...]]></description><link>https://blogs.namratajoshi.me/analysis-of-mcdonalds-menu-using-google-sheet</link><guid isPermaLink="true">https://blogs.namratajoshi.me/analysis-of-mcdonalds-menu-using-google-sheet</guid><category><![CDATA[data analysis]]></category><category><![CDATA[excel]]></category><category><![CDATA[beginner]]></category><category><![CDATA[Nutrition]]></category><dc:creator><![CDATA[Namrata Joshi]]></dc:creator><pubDate>Wed, 30 Nov 2022 01:49:26 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/unsplash/J0ZD8r_ClGg/upload/v1669669115561/GxRDKcNJO.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction"><strong>Introduction</strong></h2>
<p>Hello all,</p>
<p>I am Namrata Joshi, a Physical Therapist and a NASM Certified Nutrition Coach. I am advancing towards a career change and learning Excel as a part of my journey. I wanted to combine my nutrition knowledge and Excel learnings and hence decided to use this dataset of <a target="_blank" href="https://www.kaggle.com/datasets/mcdonalds/nutrition-facts">Nutrition Facts for McDonald's Menu</a> to analyze the data and create a useful dashboard. Any feedback and comments are welcomed and appreciated.</p>
<h2 id="heading-tldr">TL;DR</h2>
<p>In this analysis, I found the unique categories of items, the menu items with the most protein, and the low-calorie items with nutritional benefits among beverages, coffee &amp; tea, and smoothies &amp; shakes.</p>
<p>I have used multiple <code>Filters</code>, <code>Sort</code>, <code>Conditional Formatting</code>, and <code>UNIQUE</code>, <code>COUNTIF</code>, <code>SMALL</code> and <code>LARGE</code> functions using various conditions to get to the specific data. I have also inserted different charts to my filtered and sorted data to represent the analysis visually.</p>
<p>The complete analysis can be found in <a target="_blank" href="https://docs.google.com/spreadsheets/d/1bWs6a_lLeVZ0qQ-eIx3gRpyjnqZTa_9EJ_r_HM43HBs/edit?usp=sharing">this Google sheet</a>.</p>
<h2 id="heading-data-analysis">Data Analysis</h2>
<h3 id="heading-finding-unique-menu-categories">Finding unique menu categories.</h3>
<p>I created a copy of the menu in a new sheet and named it <code>menu_dataset</code>. I further applied some formats in the first heading row to get a better read of the data.</p>
<p><strong>Steps</strong></p>
<ul>
<li><p>Creation of <code>distribution of categories</code> sheet</p>
</li>
<li><p>Use of a <code>UNIQUE</code> function</p>
</li>
</ul>
<pre><code class="lang-plaintext">=UNIQUE(menu_dataset!A:A)
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669673118860/bmYHAp1y0.png" alt="image.png" /></p>
<h3 id="heading-finding-the-number-of-items-in-each-category">Finding the number of items in each category.</h3>
<p><strong>Steps</strong></p>
<ul>
<li>I used a <code>COUNTIF</code> function in the second column B.</li>
</ul>
<pre><code class="lang-plaintext">=COUNTIF(menu_dataset!A:A, ""&amp;A1)
</code></pre>
<ul>
<li>I used an autofill to copy the formula to the other cells in column B.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669675731630/a-ot6WGRm.png" alt="image.png" /></p>
<ul>
<li>Created a pie chart of unique categories and % of items in them.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669675907224/BYpZ4RPtP.png" alt="image.png" /></p>
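<p>As a cross-check outside the spreadsheet, the <code>UNIQUE</code> plus <code>COUNTIF</code> combination corresponds to a simple frequency count. The Python sketch below is illustrative only; the short <code>categories</code> list is a made-up sample, not the real menu data.</p>
<pre><code class="lang-python">from collections import Counter

# Hypothetical sample of the Category column; the real sheet has many more rows.
categories = ["Breakfast", "Breakfast", "Beverages", "Desserts", "Beverages", "Breakfast"]

# Counter does the work of UNIQUE (its keys) and COUNTIF (its values) in one pass.
counts = Counter(categories)
total = sum(counts.values())

# Share of each category in percent, which is what the pie chart displays.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

print(counts.most_common())
print(shares)
</code></pre>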
<h3 id="heading-finding-the-high-protein-items">Finding the high protein items.</h3>
<p>I found the items containing more than 25 g of protein (the approximate amount recommended per meal for muscle protein synthesis). I then calculated % protein (the share of an item's total calories that comes from protein) and retrieved a list of the top 5 items by % protein.</p>
<p><strong>Steps</strong></p>
<ul>
<li><p>Creation of <code>Items with high % protein</code> sheet.</p>
</li>
<li><p>Finding items with protein&gt;25.</p>
</li>
</ul>
<p>I used the following filter function by selecting a range including categories, items, serving size and calories in <code>menu_dataset</code> sheet with a condition of protein&gt;25.</p>
<pre><code class="lang-plaintext">=FILTER(menu_dataset!A2:D261,menu_dataset!T2:T261&gt;25)
</code></pre>
<p>I also used the following filter function in a new column by selecting the range of the protein column in the <code>menu_dataset</code> sheet with a condition of protein&gt;25.</p>
<pre><code class="lang-plaintext">=FILTER(menu_dataset!T2:T261,menu_dataset!T2:T261&gt;25)
</code></pre>
<ul>
<li>Creation of a new column (F)-<code>% protein</code>.</li>
</ul>
<p>I found out the % of proteins from the total calories of items. It was achieved by getting the total protein calories in an item first (Each gram of protein contains 4 calories) and then dividing it by the total calories of an item.</p>
<p>Hence, I created a new column and used the following formula by multiplying proteins column values by 4 and dividing by calories column values.</p>
<pre><code class="lang-plaintext">=(E2*4)/D2
</code></pre>
<p>I used an autofill to copy the formula for the other cells in column F.</p>
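<p>The arithmetic behind the <code>% protein</code> column can be checked with a few lines of Python. This is a minimal sketch: <code>protein_share</code> is a hypothetical helper and the example item is made up, while the factor of 4 calories per gram of protein is the one used in the formula above.</p>
<pre><code class="lang-python">CALORIES_PER_GRAM_PROTEIN = 4


def protein_share(protein_grams, total_calories):
    """Fraction of an item's calories that come from protein."""
    if total_calories == 0:
        # Zero-calorie items would otherwise divide by zero (#DIV/0! in a sheet).
        return 0.0
    return protein_grams * CALORIES_PER_GRAM_PROTEIN / total_calories


# Hypothetical item: 30 g of protein in a 400-calorie serving.
print(f"{protein_share(30, 400):.0%}")
</code></pre>
<p>The explicit zero check matters for drinks such as water or diet sodas, where the spreadsheet formula would return a division error instead of a percentage.</p>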
<ul>
<li>Inserting a note in the heading cell. I inserted the following note in the heading cell- % Protein (F1).</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669677203977/q9W5Os31S.png" alt="image.png" /></p>
<ul>
<li>Highlighting the top 5 items with high % protein. I used conditional formatting and the following LARGE function to achieve that.</li>
</ul>
<pre><code class="lang-plaintext">=$F2&gt;=LARGE($F$2:$F$38,5)
</code></pre>
<p>In the conditional format rule, I selected the range <code>A2:F38</code> and chose <code>format cells</code> if the custom formula above is true. In this formula, <code>LARGE</code> returns the 5th largest value in column F, and the rule highlights every row whose % protein is greater than or equal to that value. I then set a custom style that fills the matching rows with the light green 1 colour.</p>
<ul>
<li>Sorting the top 5 % Protein items.</li>
</ul>
<p>I used the following <code>SORTN</code> function, where I selected the range <code>($A$2:$F$38)</code> first, followed by 5 (the number of results I want), 0 (the tie mode, so that at most five rows are returned), 6 (the column to sort by), and FALSE (to sort in descending order).</p>
<pre><code class="lang-plaintext">=SORTN($A$2:$F$38,5,0,6,FALSE)
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669677714928/lLF9NjRsl.png" alt="image.png" /></p>
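<p>To make the relationship between <code>LARGE</code> and <code>SORTN</code> concrete, here is a small Python sketch of the same idea. The item names and values are placeholders rather than rows from the menu, and <code>by_share</code> is a hypothetical helper.</p>
<pre><code class="lang-python">from heapq import nlargest

# Hypothetical (item, % protein) pairs standing in for the sheet's rows.
items = [
    ("Item A", 0.31), ("Item B", 0.18), ("Item C", 0.27), ("Item D", 0.22),
    ("Item E", 0.35), ("Item F", 0.25), ("Item G", 0.29),
]


def by_share(pair):
    """Sort key: the % protein value of an (item, share) pair."""
    return pair[1]


# Like SORTN(range, 5, 0, 6, FALSE): the five rows with the highest share, highest first.
top_five = nlargest(5, items, key=by_share)

# Like LARGE(range, 5): the cut-off that the formatting rule compares each row against.
threshold = by_share(top_five[-1])

print(top_five)
print(threshold)
</code></pre>
<p>One difference worth keeping in mind: a rule that compares each row against the cut-off can highlight more than five rows when values tie at the 5th place, whereas tie mode 0 in <code>SORTN</code> returns at most five.</p>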
<ul>
<li>Creating a chart listing items from highest to lowest % protein. I created a bar chart containing the top 5 items, ordered from the highest to the lowest % protein value.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669677982831/FZxdUbEbx.png" alt="image.png" /></p>
<h3 id="heading-making-healthier-choices-in-menu-drinks">Making healthier choices in menu drinks.</h3>
<p><strong>Steps</strong></p>
<ul>
<li><p>Creating a sheet named <code>Beverages, Tea/Coffee, Smoothies/Shakes</code>.</p>
</li>
<li><p>Finding items meeting their daily recommended values for saturated fats, cholesterol and total sugars.</p>
</li>
</ul>
<p>I looked for the items in the above categories that stay within the daily recommended limits, <code>saturated fats&lt;10%</code>, <code>cholesterol&lt;300</code> and <code>sugars&lt;50</code> (approximate recommended values for total sugars), while still containing some <code>proteins</code>, <code>vitamin A</code> and <code>calcium</code>. For this, I used the following filter function, selecting a range of adjacent and non-adjacent columns and applying the above conditions.</p>
<pre><code class="lang-plaintext">=FILTER({menu_dataset!A112:A261,menu_dataset!B112:B261,menu_dataset!D112:D261,menu_dataset!G112:G261,menu_dataset!P112:P261,menu_dataset!T112:T261,menu_dataset!U112:U261,menu_dataset!W112:W261},menu_dataset!I112:I261&lt;10,menu_dataset!K112:K261&lt;300,menu_dataset!S112:S261&lt;50,menu_dataset!T112:T261&lt;&gt;0,menu_dataset!U112:U261&lt;&gt;0,menu_dataset!W112:W261&lt;&gt;0)
</code></pre>
<ul>
<li>Finding the 5 lowest-calorie items with nutritional benefits.</li>
</ul>
<p>I used <code>conditional formatting</code> and the following <code>SMALL</code> function to highlight the top 5 low-calorie items.</p>
<pre><code class="lang-plaintext">=$C2&lt;=SMALL($C$2:$C$16,5)
</code></pre>
<p>In the conditional format rule, I selected the range <code>A2:H16</code> and chose <code>format cells</code> if the custom formula above is true. In this formula, <code>SMALL</code> returns the 5th smallest calorie value in column C, and the rule highlights every row whose calories are less than or equal to that value. I then set a custom style that fills the matching rows with the light green 1 colour.</p>
<ul>
<li>Sorting 5 low-calorie items in their ascending order using the <code>SORTN</code> function.</li>
</ul>
<p>a) 5 low-calorie items with % daily value distribution.</p>
<p>This included sorting 5 low-calorie items and their % daily values of Total fat, carbohydrate, Vitamin A and Calcium using the following function.</p>
<pre><code class="lang-plaintext">=SORTN({$A$2:$E$16,$G$2:$H$16},5,0,3,TRUE)
</code></pre>
<p>I sorted a few columns with 2 different ranges containing the columns of category, Items, Calories, Total fat(% daily value), Carbohydrate(% daily value), Vitamin A(% daily value) and Calcium( % daily value). I later selected 5(as the number of results I want), 0 (as only 5 results to return), 3 (as the 3rd column of calories to sort), TRUE (to sort in ascending order).</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669689504463/9gIFY6YV-.png" alt="image.png" /></p>
<ul>
<li>Bar graph chart</li>
</ul>
<p>I hid the column of calories and inserted a bar graph chart selecting the columns of Items and (Total fats, carbohydrates, Vitamin A, Calcium) % daily values.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669690163111/ukz2KW-pe.png" alt="image.png" /></p>
<p>b) 5 low-calorie items with Protein distribution</p>
<p>This includes sorting 5 low-calorie items with their protein values using the following function.</p>
<pre><code class="lang-plaintext">=SORTN({$B$2:$C$16,$F$2:$F$16},5,0,2,TRUE)
</code></pre>
<p>I sorted a few columns with 2 different ranges containing the columns of Items, Calories and Protein. I later selected 5(as the number of results I want), 0 (as only 5 results to return), 2 (as the 2nd column of calories to sort), TRUE(sort in ascending order)</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669689924278/KBHFhxdsk.png" alt="image.png" /></p>
<ul>
<li>Bar graph chart</li>
</ul>
<p>I hid the column of Calories and inserted a bar graph chart selecting the columns of Items and Protein.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669690359675/1WTmnDrEH.png" alt="image.png" /></p>
<h2 id="heading-lessons-learnt">Lessons Learnt</h2>
<ol>
<li><p>I learnt effective ways to use the FILTER and SORT functions in this large dataset to get to the specific data.</p>
</li>
<li><p>I learnt a new way to select a range, especially for the non-adjacent columns which helped me to use the specific columns for the analysis.</p>
</li>
<li><p>I found the LARGE and SMALL functions very effective in the conditional formatting to highlight the rows meeting the specific conditions for a better visual representation of the data.</p>
</li>
<li><p>I learnt to customize my inserted charts to label the data in the Google sheet.</p>
</li>
</ol>
<h2 id="heading-resources">Resources</h2>
<p><a target="_blank" href="https://www.kaggle.com/datasets/mcdonalds/nutrition-facts">Kaggle dataset of Nutrition Facts for McDonald's Menu</a></p>
<p><a target="_blank" href="https://docs.google.com/spreadsheets/d/1bWs6a_lLeVZ0qQ-eIx3gRpyjnqZTa_9EJ_r_HM43HBs/edit#gid=586829976">Google sheet with detailed analysis</a></p>
]]></content:encoded></item><item><title><![CDATA[Learning SQL and solving Murder Mystery]]></title><description><![CDATA[Introduction
Hello there,
My name is Namrata Joshi, coming from a medical background who practiced physical therapy for almost 12 years. I recently started learning SQL as I am advancing towards a career change and have decided to blog my journey. I ...]]></description><link>https://blogs.namratajoshi.me/learning-sql-and-solving-murder-mystery</link><guid isPermaLink="true">https://blogs.namratajoshi.me/learning-sql-and-solving-murder-mystery</guid><category><![CDATA[SQL]]></category><category><![CDATA[Beginner Developers]]></category><category><![CDATA[database]]></category><category><![CDATA[Learning Journey]]></category><category><![CDATA[Games]]></category><dc:creator><![CDATA[Namrata Joshi]]></dc:creator><pubDate>Wed, 23 Nov 2022 04:54:54 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/unsplash/9tamF4J0vLk/upload/v1669075604222/V9cxDah_U.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-introduction">Introduction</h1>
<p>Hello there,</p>
<p>My name is Namrata Joshi, coming from a medical background who practiced physical therapy for almost 12 years. I recently started learning SQL as I am advancing towards a career change and have decided to blog my journey. I found this amazing online source that helped me to not only practice SQL but also learn in an interesting and fun way. I hope this blog comes in handy to anyone who is trying to solve the <a target="_blank" href="https://mystery.knightlab.com/">SQL City Murder Mystery.</a></p>
<h2 id="heading-step-1-finding-crime-scene-report">Step 1 : Finding crime scene report</h2>
<p>I retrieved the crime scene report from the police department's database using the information provided (date = 15 Jan 2018, city = SQL City, type = murder). I first ran <code>select * from crime_scene_report</code> to see how dates are formatted, and then used that format in the query below to get the descriptions in <code>crime_scene_report</code> matching the date and city.</p>
<h3 id="heading-query">Query</h3>
<pre><code>SELECT * 
<span class="hljs-keyword">from</span> crime_scene_report 
where date=<span class="hljs-string">"20180115"</span> 
and city = <span class="hljs-string">"SQL City"</span>;
</code></pre><h3 id="heading-output">Output</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669066897238/1Oj5OetJr.png" alt="image.png" /></p>
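<p>For anyone who wants to try the same filter outside the game's web page, here is a minimal Python sketch using the standard <code>sqlite3</code> module. It assumes the <code>date</code> column holds values in YYYYMMDD form, as the query above suggests. The simplified table definition and the rows are placeholders, not the game's real data, and the extra <code>type</code> condition is an optional way to narrow the result to the murder report.</p>
<pre><code class="lang-python">import sqlite3
from datetime import date

# Dates appear in YYYYMMDD form (for example 20180115), so format the date to match.
murder_date = int(date(2018, 1, 15).strftime("%Y%m%d"))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE crime_scene_report (date INTEGER, type TEXT, description TEXT, city TEXT)"
)
# Placeholder rows, not the game's real data.
conn.executemany(
    "INSERT INTO crime_scene_report VALUES (?, ?, ?, ?)",
    [
        (20180115, "murder", "placeholder description", "SQL City"),
        (20180115, "theft", "placeholder description", "SQL City"),
        (20180116, "murder", "placeholder description", "Another City"),
    ],
)

# Parameter placeholders (?) keep the values separate from the SQL text.
rows = conn.execute(
    "SELECT description FROM crime_scene_report WHERE date = ? AND type = ? AND city = ?",
    (murder_date, "murder", "SQL City"),
).fetchall()
print(rows)
</code></pre>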
<h2 id="heading-step-2-finding-first-witness">Step 2 : Finding First Witness</h2>
<p>The description above says there are 2 witnesses. I used the clue about the first witness, "Lives at the last house on Northwestern Dr", to query the <code>person</code> table. I selected the <code>name</code> and <code>address_number</code> columns for rows where <code>address_street_name = "Northwestern Dr"</code>, used <code>ORDER BY</code> to sort <code>address_number</code> in descending order (from the last house to the first), and limited the output to 3 rows. The first row of the result is the first witness.</p>
<h3 id="heading-query-1">Query</h3>
<pre><code>SELECT name,
       address_number
<span class="hljs-keyword">FROM</span> person
WHERE address_street_name = <span class="hljs-string">'Northwestern Dr'</span>
ORDER BY address_number DESC
LIMIT <span class="hljs-number">3</span>;
</code></pre><h3 id="heading-output-1">Output</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669076125385/sryzGB5Xy.png" alt="image.png" /></p>
<p>First witness: Morty Schapiro</p>
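<p>To see why sorting in descending order finds the “last house”, here is a minimal sketch that runs the same kind of query with Python's built-in <code>sqlite3</code> module. The table is a made-up slice of <code>person</code>, and the resident names and house numbers are hypothetical, not rows from the game database.</p>

```python
import sqlite3

# In-memory database with a made-up slice of the person table (hypothetical rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (name TEXT, address_number INTEGER, address_street_name TEXT)"
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [
        ("Resident A", 120, "Northwestern Dr"),
        ("Resident B", 4919, "Northwestern Dr"),
        ("Resident C", 3000, "Northwestern Dr"),
        ("Resident D", 9999, "Franklin Ave"),
    ],
)

# The highest house number on the street sorts first; LIMIT 1 keeps only that row.
last_house = conn.execute(
    """
    SELECT name, address_number
    FROM person
    WHERE address_street_name = 'Northwestern Dr'
    ORDER BY address_number DESC
    LIMIT 1
    """
).fetchone()

print(last_house)
```

<p>With <code>LIMIT 1</code> only the last house comes back. A larger limit, as in my query above, also shows the neighbouring houses, which is handy for a sanity check.</p>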
<h2 id="heading-step-3-finding-second-witness">Step 3: Finding the second witness</h2>
<p>The <code>crime_scene_report</code> description also has a clue about the second witness, whose first name is “Annabel” and who lives somewhere on “Franklin Ave”. I queried the <code>person</code> table for the <code>name</code> and <code>address_number</code>, using a <code>WHERE</code> clause with <code>address_street_name = 'Franklin Ave'</code>. Since only the first name was known, I used <code>LIKE</code> with the wildcard <code>%</code> after “Annabel” to match any name that starts with Annabel. This gave me the full name of the second witness.</p>
<h3 id="heading-query-2">Query</h3>
<pre><code>SELECT name,
       address_number
<span class="hljs-keyword">FROM</span> person
WHERE address_street_name = <span class="hljs-string">'Franklin Ave'</span>
AND name LIKE <span class="hljs-string">'Annabel%'</span>;
</code></pre><h3 id="heading-output-2">Output</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669076323982/MHJ50iiVB.png" alt="image.png" /></p>
<p>Second witness: Annabel Miller</p>
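<p>The position of <code>%</code> changes what <code>LIKE</code> matches. The sketch below, again using Python's built-in <code>sqlite3</code> module, compares a “starts with” pattern and a “contains” pattern. Apart from Annabel Miller, the names are made up for illustration, and <code>names_like</code> is a hypothetical helper.</p>

```python
import sqlite3

# Tiny made-up person table; only "Annabel Miller" comes from the walkthrough.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?)",
    [("Annabel Miller",), ("Annabel Example",), ("Mary Annabel",)],
)


def names_like(pattern):
    """Return all names matching a LIKE pattern, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM person WHERE name LIKE ? ORDER BY name", (pattern,)
    ).fetchall()
    return [row[0] for row in rows]


# 'Annabel%' matches names that START with Annabel.
starts_with = names_like("Annabel%")
# '%Annabel%' matches names that CONTAIN Annabel anywhere.
contains = names_like("%Annabel%")

print(starts_with)
print(contains)
```

<p>The first pattern leaves out “Mary Annabel”, while the second one includes it. That is why the address filter on “Franklin Ave” is still needed to pin down a single person.</p>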
<h2 id="heading-step-4-finding-the-witnesses-interview-transcripts">Step 4: Finding the witnesses' interview transcripts</h2>
<p>After getting the full names of both witnesses, I needed their interview transcripts. The transcript is stored in the <code>interview</code> table but the name is stored in the <code>person</code> table, so I queried both. I joined the two tables in the <code>WHERE</code> clause by matching the primary key <code>id</code> of the <code>person</code> table with the foreign key <code>person_id</code> of the <code>interview</code> table, and added an <code>AND</code> condition with the name of each witness in separate queries to get their transcripts.</p>
<h3 id="heading-1-morty-schapiro">1) Morty Schapiro</h3>
<h4 id="heading-query-3">Query</h4>
<pre><code>SELECT person.name,
       interview.transcript
<span class="hljs-keyword">FROM</span> person, interview
WHERE person.id = interview.person_id
AND person.name = <span class="hljs-string">'Morty Schapiro'</span>;
</code></pre><h4 id="heading-output-3">Output</h4>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669076699196/iXRHFICXj.png" alt="image.png" /></p>
<h3 id="heading-2-annabel-miller">2) Annabel Miller</h3>
<h4 id="heading-query-4">Query</h4>
<pre><code>SELECT person.name,
       interview.transcript
<span class="hljs-keyword">FROM</span> person, interview
WHERE person.id = interview.person_id
AND person.name = <span class="hljs-string">'Annabel Miller'</span>;
</code></pre><h4 id="heading-output-4">Output</h4>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669076797257/z7NU8XOKu.png" alt="image.png" /></p>
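<p>The two queries above can also be written as one query with an explicit <code>JOIN ... ON</code> and an <code>IN</code> list. This is only a sketch of that alternative, run with Python's built-in <code>sqlite3</code> module against made-up tables; the transcripts are placeholders, not the real interview text.</p>

```python
import sqlite3

# Made-up person and interview tables with placeholder transcripts.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE interview (person_id INTEGER, transcript TEXT);
    INSERT INTO person VALUES
        (1, 'Morty Schapiro'), (2, 'Annabel Miller'), (3, 'Someone Else');
    INSERT INTO interview VALUES
        (1, 'placeholder transcript A'),
        (2, 'placeholder transcript B'),
        (3, 'placeholder transcript C');
    """
)

# JOIN ... ON states the key match; WHERE ... IN filters both witnesses at once.
transcripts = conn.execute(
    """
    SELECT person.name, interview.transcript
    FROM person
    JOIN interview ON person.id = interview.person_id
    WHERE person.name IN ('Morty Schapiro', 'Annabel Miller')
    ORDER BY person.name
    """
).fetchall()

for name, transcript in transcripts:
    print(name, transcript)
```

<p>The join condition moves into <code>ON</code>, so the <code>WHERE</code> clause is left with only the filter on names.</p>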
<h2 id="heading-step-5-finding-gym-members">Step 5: Finding gym members</h2>
<p>I combined the information from both interview transcripts, as they were related: both witnesses identified the murderer as a member of the “Get Fit Now Gym”. I therefore joined the <code>get_fit_now_member</code> and <code>get_fit_now_check_in</code> tables to get the <code>name</code>, <code>id</code> and <code>membership_status</code> of the matching members. From the witnesses, I had the <code>check_in_date</code> (9th January 2018, stored as <code>20180109</code>) for the <code>get_fit_now_check_in</code> table and a membership id starting with “48Z” for the <code>get_fit_now_member</code> table. I joined the two tables in the <code>WHERE</code> clause by matching the primary key <code>id</code> of <code>get_fit_now_member</code> with the foreign key <code>membership_id</code> of <code>get_fit_now_check_in</code>, then added <code>AND</code> conditions for the date and, using the wildcard <code>%</code>, for ids starting with “48Z”. The output lists the matching members, all with gold membership status.</p>
<h3 id="heading-query-5">Query</h3>
<pre><code>SELECT get_fit_now_member.name,
       get_fit_now_member.id,
       get_fit_now_member.membership_status
<span class="hljs-keyword">FROM</span> get_fit_now_member, get_fit_now_check_in
WHERE get_fit_now_member.id = get_fit_now_check_in.membership_id
AND get_fit_now_check_in.check_in_date = <span class="hljs-string">'20180109'</span>
AND get_fit_now_member.id LIKE <span class="hljs-string">'48Z%'</span>;
</code></pre><h3 id="heading-output-5">Output</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669077992802/kd7nr7nb6.png" alt="image.png" /></p>
<h2 id="heading-step-6-querying-the-driverslicense-table">Step 6: Querying the drivers_license table</h2>
<p>Using the license plate information provided by Morty Schapiro, I wanted to find the name of the person whose car license plate number contains “H42W”. For this, I used the <code>person</code> and <code>drivers_license</code> tables. I joined them in the <code>WHERE</code> clause by matching the primary key <code>id</code> of the <code>drivers_license</code> table with the foreign key <code>license_id</code> of the <code>person</code> table. I then added an <code>AND</code> condition with the wildcard <code>%</code> on both sides of “H42W”, so that the plate can contain it anywhere, to get the list of names and their license plate numbers.</p>
<h3 id="heading-query-6">Query</h3>
<pre><code>SELECT person.name,
       drivers_license.plate_number
<span class="hljs-keyword">FROM</span> person, drivers_license
WHERE person.license_id = drivers_license.id
AND drivers_license.plate_number LIKE <span class="hljs-string">'%H42W%'</span>;
</code></pre><h3 id="heading-output-6">Output</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669077226977/6VXEjfuhw.png" alt="image.png" /></p>
<h2 id="heading-step-7-we-found-the-murderer">Step 7: We found the murderer</h2>
<p>Comparing the outputs of Step 5 and Step 6, “Jeremy Bowers” is the name common to both, so he is the murderer.</p>
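<p>Instead of comparing the two outputs by eye, Steps 5 and 6 could be merged into a single query. The sketch below shows the idea with Python's built-in <code>sqlite3</code> module. It assumes that <code>get_fit_now_member</code> has a <code>person_id</code> column pointing to <code>person.id</code> (an assumption I did not use in my own queries), and every row in it is made up.</p>

```python
import sqlite3

# Made-up tables and rows; the person_id link on get_fit_now_member is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, license_id INTEGER);
    CREATE TABLE drivers_license (id INTEGER PRIMARY KEY, plate_number TEXT);
    CREATE TABLE get_fit_now_member (
        id TEXT PRIMARY KEY, person_id INTEGER, name TEXT, membership_status TEXT
    );
    CREATE TABLE get_fit_now_check_in (membership_id TEXT, check_in_date INTEGER);

    INSERT INTO person VALUES
        (1, 'Member One', 10), (2, 'Member Two', 20), (3, 'Driver Three', 30);
    INSERT INTO drivers_license VALUES
        (10, '0H42W2'), (20, 'ABC123'), (30, 'H42W0X');
    INSERT INTO get_fit_now_member VALUES
        ('48Z01', 1, 'Member One', 'gold'),
        ('48Z02', 2, 'Member Two', 'gold');
    INSERT INTO get_fit_now_check_in VALUES
        ('48Z01', 20180109), ('48Z02', 20180109);
    """
)

# One query applies the gym clues and the license plate clue together.
suspects = conn.execute(
    """
    SELECT person.name, get_fit_now_member.id, drivers_license.plate_number
    FROM get_fit_now_member
    JOIN get_fit_now_check_in
      ON get_fit_now_member.id = get_fit_now_check_in.membership_id
    JOIN person ON person.id = get_fit_now_member.person_id
    JOIN drivers_license ON drivers_license.id = person.license_id
    WHERE get_fit_now_check_in.check_in_date = 20180109
      AND get_fit_now_member.id LIKE '48Z%'
      AND drivers_license.plate_number LIKE '%H42W%'
    """
).fetchall()

print(suspects)
```

<p>In this toy data only “Member One” satisfies all three clues: “Member Two” has the wrong plate and “Driver Three” is not a gym member.</p>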
<h2 id="heading-step-8-verification">Step 8: Verification</h2>
<h3 id="heading-a-checking-jeremy-bowers-interview-transcript">a) Checking Jeremy Bowers' interview transcript</h3>
<p>I found the interview transcript of Jeremy Bowers using the <code>person</code> and <code>interview</code> tables. I needed both because the <code>id</code> and <code>name</code> columns are in the <code>person</code> table and the <code>transcript</code> column is in the <code>interview</code> table. I joined them in the <code>WHERE</code> clause by matching the primary key <code>id</code> of the <code>person</code> table with the foreign key <code>person_id</code> of the <code>interview</code> table, and added an <code>AND</code> condition to restrict <code>name</code> to 'Jeremy Bowers'.</p>
<h4 id="heading-query-7">Query</h4>
<pre><code>SELECT person.id,
       person.name,
       interview.transcript
<span class="hljs-keyword">FROM</span> person, interview
WHERE person.id = interview.person_id
AND person.name = <span class="hljs-string">'Jeremy Bowers'</span>;
</code></pre><h4 id="heading-output-7">Output</h4>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669077433300/DQxmEoGfX.png" alt="image.png" /></p>
<h3 id="heading-b-finding-the-brain-behind-the-murder">b) Finding the brain behind the murder</h3>
<h4 id="heading-option-1">Option 1</h4>
<p>I had to find the <code>name</code> and <code>id</code> of the person matching the description in Jeremy Bowers' <code>transcript</code>. I used four tables in the query below: <code>person</code>, <code>drivers_license</code>, <code>income</code> and <code>facebook_event_checkin</code>. I joined them on their keys with a <code>WHERE</code> clause and several <code>AND</code> conditions, and filtered on hair color being red, car make being Tesla and gender being female to retrieve only the rows matching that description.</p>
<h4 id="heading-query-8">Query</h4>
<pre><code>SELECT person.id,
       person.name
<span class="hljs-keyword">FROM</span> person, drivers_license, income, facebook_event_checkin
WHERE person.license_id = drivers_license.id
AND person.id = facebook_event_checkin.person_id
AND person.ssn = income.ssn
AND drivers_license.hair_color = <span class="hljs-string">'red'</span>
AND drivers_license.car_make = <span class="hljs-string">'Tesla'</span>
AND drivers_license.gender = <span class="hljs-string">'female'</span>;
</code></pre><h4 id="heading-output-8">Output</h4>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669077610362/_BFM1DQ0N.png" alt="image.png" /></p>
<h4 id="heading-option-2">Option 2</h4>
<p>Here I first wrote a subquery to find the person's id, name, height, car model and income from three tables: <code>person</code>, <code>drivers_license</code> and <code>income</code>. It uses a <code>WHERE</code> clause and several <code>AND</code> conditions to join the three tables on their keys and to filter on the hair color, car make and gender derived from Jeremy’s interview transcript, and an <code>ORDER BY</code> to sort the <code>annual_income</code> column in descending order. I then wrapped it in an outer query that selects the <code>id</code> and <code>name</code> from the subquery and joins them to the <code>facebook_event_checkin</code> table, matching the subquery's <code>id</code> with <code>person_id</code> to retrieve the matching data.</p>
<h4 id="heading-query-9">Query</h4>
<pre><code>SELECT id, name
<span class="hljs-keyword">FROM</span>
       (SELECT person.id,
               person.name,
               drivers_license.height,
               drivers_license.car_model,
               income.annual_income
        <span class="hljs-keyword">FROM</span> person, drivers_license, income
        WHERE person.license_id = drivers_license.id
        AND person.ssn = income.ssn
        AND drivers_license.hair_color = <span class="hljs-string">'red'</span>
        AND drivers_license.car_make = <span class="hljs-string">'Tesla'</span>
        AND drivers_license.gender = <span class="hljs-string">'female'</span>
        ORDER BY income.annual_income DESC),
     facebook_event_checkin
WHERE id = facebook_event_checkin.person_id;
</code></pre><h4 id="heading-output-9">Output</h4>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1669077711326/XaH03V6qB.png" alt="image.png" /></p>
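<p>One thing to keep in mind with both options: joining to <code>facebook_event_checkin</code> returns one row per check-in, so a person with several check-ins appears several times. The sketch below, using Python's built-in <code>sqlite3</code> module with made-up people and events, shows how <code>SELECT DISTINCT</code> would collapse those repeats.</p>

```python
import sqlite3

# Made-up people and check-ins; Person One has three check-ins.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE facebook_event_checkin (person_id INTEGER, event_name TEXT);
    INSERT INTO person VALUES (1, 'Person One'), (2, 'Person Two');
    INSERT INTO facebook_event_checkin VALUES
        (1, 'Event A'), (1, 'Event A'), (1, 'Event A'), (2, 'Event B');
    """
)

join_sql = (
    "FROM person "
    "JOIN facebook_event_checkin ON person.id = facebook_event_checkin.person_id "
    "WHERE person.id = 1"
)

# Without DISTINCT the join yields one row per check-in.
per_checkin = conn.execute("SELECT person.id, person.name " + join_sql).fetchall()
# With DISTINCT identical rows are collapsed into one.
per_person = conn.execute(
    "SELECT DISTINCT person.id, person.name " + join_sql
).fetchall()

print(len(per_checkin))
print(per_person)
```

<p>Either version answers the question of who the person is; <code>DISTINCT</code> only makes the output shorter to read.</p>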
<h2 id="heading-mystery-solved">Mystery Solved</h2>
<h3 id="heading-brain-behind-the-murder">Brain behind the murder</h3>
<p>Miranda Priestly</p>
<h2 id="heading-lessons-learnt">Lessons Learnt</h2>
<ol>
<li>It was helpful to first query each table to get a better feel for the data inside it.</li>
<li>I learnt to think creatively about which tables to join, depending on the information I was getting from each query.</li>
<li>It was fun to play around with wildcards and filters to retrieve the exact data, although it took some trial and error to get rid of a few syntax errors.</li>
<li>I could also have used explicit <code>JOIN</code> clauses in my queries, although I found it easier to join and filter the data using <code>WHERE</code> and <code>AND</code>.</li>
<li>Readability was an issue in some of my subqueries until I started to indent them.</li>
<li>It is always helpful to think about multiple ways to retrieve the required data.</li>
</ol>
<h2 id="heading-resources">Resources</h2>
<p><a target="_blank" href="https://mystery.knightlab.com/">SQL City Murder Mystery</a></p>
]]></content:encoded></item></channel></rss>