SQL & Visualisation | Human cost of Covid-19

mellishamallikage

Updated: Jun 16, 2022

Understanding whether the human cost of COVID has been uniform throughout the globe.

Introduction

Since January 2020, the world has been greatly affected by COVID-19, a respiratory illness caused by a new strain of coronavirus (ECDC, 2021). Many nations have seen significant loss of life and have imposed various lockdowns to curb its spread. This project therefore reviews whether some nations have suffered more than others, using SQL and Tableau.


On a macro level, "suffering" can be measured in many ways, including economic downturn. In this project, however, suffering is measured by the number of confirmed cases and deaths from COVID. The data comes from the BigQuery public table bigquery-public-data.covid19_open_data.covid19_open_data.


Impact of COVID

On a basic level, one can assume that nations with a higher number of confirmed cases will have a higher number of deaths. This hypothesis can be evaluated through a review of the nations with the most confirmed cases and the most deaths.


Highest number of COVID cases

SELECT
  country_name,
  MAX(cumulative_confirmed) AS total_confirmed
FROM `bigquery-public-data.covid19_open_data.covid19_open_data`
WHERE cumulative_confirmed IS NOT NULL
GROUP BY country_name
ORDER BY total_confirmed DESC
LIMIT 5

Highest number of COVID deaths

SELECT
  country_name,
  MAX(cumulative_deceased) AS total_deceased
FROM `bigquery-public-data.covid19_open_data.covid19_open_data`
WHERE cumulative_deceased IS NOT NULL
GROUP BY country_name
ORDER BY total_deceased DESC
-- Limit to the top 5 so the result is comparable with the confirmed-cases list
LIMIT 5

The majority of the nations appear in both lists. However, it is not a perfect match, as the UK and Mexico each appear in only one list. Furthermore, plotting deaths against confirmed cases shows, as expected, a relationship between the two.
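The comparison of the two lists can be sketched with Python sets. The membership below is inferred from the text (the six countries named in the mortality query, with the UK assumed to be in the cases list only and Mexico in the deaths list only); the order within each list is not taken from the query output.

```python
# Membership inferred from the article's text, not copied from query output.
top_cases = {"United States of America", "India", "Brazil", "United Kingdom", "Russia"}
top_deaths = {"United States of America", "India", "Brazil", "Russia", "Mexico"}

# Countries that appear in both top-5 lists
in_both = top_cases & top_deaths

# Countries that appear in exactly one of the two lists
in_only_one = top_cases ^ top_deaths

print("In both lists:", sorted(in_both))
print("In only one list:", sorted(in_only_one))
```

The union of the two sets gives the six countries used in the mortality rate comparison.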

Beyond approx. 5 million confirmed cases, the relationship between confirmed cases and deaths appears to vary between nations. One example is Mexico, which has one of the highest death tolls from COVID although it does not have one of the highest numbers of confirmed cases. Therefore, according to this data, a person with a confirmed case of COVID in Mexico is at a higher risk of dying.


This is reinforced by calculating the mortality rate (deaths as a percentage of confirmed cases) for these nations. The result highlights the stark difference in the rate of death between nations: in Mexico, around 7.6% of confirmed cases ended in death, whilst in the UK the figure is only around 1.3%.


Mortality Rate

SELECT
  country_name,
  total_confirmed,
  total_deaths,
  -- Deaths as a percentage of confirmed cases
  ROUND((total_deaths / total_confirmed) * 100, 2) AS mortality_rate
FROM (
  -- Latest cumulative totals per country
  SELECT
    country_name,
    MAX(cumulative_confirmed) AS total_confirmed,
    MAX(cumulative_deceased) AS total_deaths
  FROM `bigquery-public-data.covid19_open_data.covid19_open_data`
  WHERE country_name IN ("United States of America", "India", "Brazil", "United Kingdom", "Russia", "Mexico")
  GROUP BY country_name
)
ORDER BY mortality_rate DESC
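As a worked illustration of the ratio the query computes, here is a minimal Python sketch. The function name `mortality_rate` and the input counts are hypothetical; the counts were chosen only so that the results match the 7.6% and 1.3% quoted in the text and are not real totals.

```python
def mortality_rate(total_confirmed, total_deaths):
    """Deaths as a percentage of confirmed cases, rounded to 2 decimal places."""
    return round(total_deaths / total_confirmed * 100, 2)

# Hypothetical counts, picked to reproduce the percentages quoted in the text
high = mortality_rate(total_confirmed=1_000, total_deaths=76)
low = mortality_rate(total_confirmed=1_000, total_deaths=13)

print(high, low)
```

Because the denominator is confirmed cases rather than the whole population, the result depends on how widely a country tests as well as on how many people die.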

Expanding the data to cover all countries highlights the substantial difference in the mortality rate between nations. It also indicates that Mexico’s 7.6% is one of the highest mortality rates.

SELECT
  country_name,
  total_confirmed,
  total_deaths,
  -- Deaths as a percentage of confirmed cases
  ROUND((total_deaths / total_confirmed) * 100, 2) AS mortality_rate
FROM (
  -- Latest cumulative totals per country
  SELECT
    country_name,
    MAX(cumulative_confirmed) AS total_confirmed,
    MAX(cumulative_deceased) AS total_deaths
  FROM `bigquery-public-data.covid19_open_data.covid19_open_data`
  GROUP BY country_name
)
-- Exclude countries with no recorded cases or deaths (this also avoids division by zero)
WHERE total_confirmed <> 0 AND total_deaths <> 0
ORDER BY mortality_rate DESC
LIMIT 5
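The filter on zero totals in the query above matters because a country with no confirmed cases would otherwise cause a division by zero. A minimal Python sketch of the same filter, sort and limit follows; the function `top_mortality`, the country labels and all counts are hypothetical.

```python
def top_mortality(rows, n=5):
    """Rank countries by deaths per 100 confirmed cases, skipping unusable rows."""
    ranked = []
    for country, confirmed, deaths in rows:
        # Mirror the SQL filter: missing (None) or zero totals are excluded,
        # which also avoids dividing by zero.
        if not confirmed or not deaths:
            continue
        ranked.append((country, round(deaths / confirmed * 100, 2)))
    # Highest rate first, then keep only the first n entries
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:n]

# Hypothetical rows: (country, total_confirmed, total_deaths)
rows = [
    ("Country A", 10_000, 1_900),
    ("Country B", 0, 0),
    ("Country C", 250_000, None),
    ("Country D", 400_000, 12_000),
]

result = top_mortality(rows, n=5)
print(result)
```

Countries B and C are dropped by the filter, so only A and D are ranked.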

On a global map, the variation in the mortality rate can be seen as follows:


Furthermore, a breakdown of the data shows that Yemen's mortality rate is nearly 10 times higher than the average.
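How far a country sits above "the average" depends on how the average is defined, which the text does not specify. The sketch below, using hypothetical country labels and counts, contrasts an unweighted mean of country rates with a pooled rate (all deaths divided by all confirmed cases) to show that the two can give quite different multiples.

```python
from statistics import mean

# Hypothetical (total_confirmed, total_deaths) per country; not real figures.
totals = {
    "Country A": (1_000, 190),
    "Country B": (500_000, 10_000),
    "Country C": (2_000_000, 30_000),
}

# Mortality rate per country, as a percentage of confirmed cases
rates = {name: deaths / confirmed * 100 for name, (confirmed, deaths) in totals.items()}

# Average 1: every country counts equally
unweighted_avg = mean(rates.values())

# Average 2: all deaths over all confirmed cases, so large countries dominate
all_confirmed = sum(confirmed for confirmed, _ in totals.values())
all_deaths = sum(deaths for _, deaths in totals.values())
pooled_avg = all_deaths / all_confirmed * 100

ratio_unweighted = rates["Country A"] / unweighted_avg
ratio_pooled = rates["Country A"] / pooled_avg

print(round(ratio_unweighted, 2), round(ratio_pooled, 2))
```

In this made-up example the same country is roughly 2.5 times the unweighted average but nearly 12 times the pooled average, so any "times the average" claim should state which average is meant.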

In addition, the data reveals that neighbouring countries do not necessarily share a similar mortality rate. For example, Brazil's rate is only approx. 3%, compared with nearly 8% for its neighbour Peru.


Given that countries such as the UK and US have a low mortality rate, the strength of the economy may be a factor. However, as Mexico's economy is ranked in the top 15, this is unlikely to be the only reason (WorldBank, 2021). Another possible explanation is that beyond a certain level of confirmed cases, a nation's healthcare system is no longer able to operate adequately, resulting in more deaths. This could also mean that some nations have a more robust healthcare system than others.


Accuracy of the data

There is a degree of disparity in the accuracy of the data. Firstly, not all COVID cases are likely to be recorded, due to lack of symptoms, limited access to tests or false negatives. According to one source, even under lab conditions the rate of false negative results is reported to be around 5%, and outside lab conditions the rate is expected to be higher (Geddes, 2021). One could run a sample test to confirm how prevalent COVID is in a given population; however, this would require resources not available for this project.


Likewise, as COVID causes “inflammation of the lungs”, some doctors may classify the death as pneumonia (Gallagher, 2021). The cause of death may also be attributed to an underlying illness rather than COVID. An alternative is to calculate the difference in the death rate before and after COVID. However, this figure captures all COVID-related deaths, not only deaths caused directly by COVID. For instance, a death caused by delays in cancer treatment would be counted in this model even though the individual did not die from COVID.
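The pre/post comparison described above can be sketched as a simple excess-deaths calculation. All figures are hypothetical, and using the mean of earlier years as the baseline is an assumption made for illustration; other baselines are possible.

```python
from statistics import mean

# Hypothetical annual death counts (all causes) for one country; not real figures.
deaths_pre_covid = {2017: 600_000, 2018: 610_000, 2019: 605_000}
deaths_observed = 690_000           # hypothetical all-cause deaths in a COVID year
deaths_recorded_as_covid = 70_000   # hypothetical deaths officially attributed to COVID

# Baseline: what a "normal" year looked like before COVID
baseline = mean(deaths_pre_covid.values())

# Excess deaths: everything above the baseline, whatever the recorded cause
excess = deaths_observed - baseline

# The gap between excess deaths and recorded COVID deaths. It mixes unrecorded
# COVID deaths with indirect deaths (e.g. delayed treatment) and cannot
# separate the two.
unattributed = excess - deaths_recorded_as_covid

print(baseline, excess, unattributed)
```

The last figure shows the limitation noted in the text: the method counts deaths related to COVID in any way, not only deaths from COVID itself.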


Consequently, the accuracy of the mortality rate also has some limitations. In addition, mortality rates can be calculated in a number of different ways, each with its own strengths and weaknesses (The Guardian, 2020).


Conclusion

Whilst the data is not perfect, it is clear that some nations have had a higher number of positive COVID cases than others. Likewise, some nations have seen an extremely high number of confirmed cases and resulting deaths.

However, the two figures do not mirror each other. In fact, beyond approx. 5 million confirmed cases, other factors also appear to affect whether a country experiences a high rate of deaths from COVID. These may include the strength of the healthcare system and/or the strength of the economy. Further review of the data is needed to establish this relationship and to determine whether any other factors explain why a nation may have a higher rate of death relative to its number of confirmed cases.


In either case, it is clear that it is key for every nation to keep the number of COVID cases as low as possible.

Finally, one should note that whilst some countries have been badly affected by COVID, there are reports indicating that improved vaccination programs have reduced the rate of COVID in such countries. Therefore, whilst the figures appear poor for the nations identified in this project, their situations can improve (Cancel and Jaramillo, 2021).


Appendix

Date of update

Although the countries with the leading numbers of confirmed cases and deaths have been identified, it is helpful to review the dates on which these figures were recorded. For instance, if the data set were not routinely updated, the comparison would not be accurate. Therefore, the following SQL checks that the information in the table is relatively recent.

SELECT
  country_name,
  MAX(cumulative_confirmed) AS total_confirmed,
  -- Most recent date on which a confirmed-case figure was recorded
  MAX(date) AS latest_date
FROM `bigquery-public-data.covid19_open_data.covid19_open_data`
WHERE cumulative_confirmed IS NOT NULL
GROUP BY country_name
ORDER BY total_confirmed DESC
LIMIT 10

Based on the output of this query, the data is indeed generally up to date, with all the figures collected in the space of under one month. The limit has been deliberately increased to 10: if a nation ranked 6-10 had out-of-date data, it could move into the top 5 once its figures were updated.

Expanding the SQL further, we can confirm that the latest cumulative_confirmed figures for all countries were recorded between 16th Nov 2021 and 17th Dec 2021, a span of 31 days.
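The 31-day figure can be checked with a short Python sketch. The two end dates come from the text; the country labels and the middle date are hypothetical placeholders for the per-country latest dates returned by the query.

```python
from datetime import date

# Latest recorded date per country. Only the earliest and most recent dates
# are taken from the text; the labels and the middle date are hypothetical.
latest_dates = {
    "Country A": date(2021, 11, 16),
    "Country B": date(2021, 12, 3),
    "Country C": date(2021, 12, 17),
}

oldest = min(latest_dates.values())
newest = max(latest_dates.values())

# Number of days between the stalest and the freshest country
span_days = (newest - oldest).days

print(oldest, newest, span_days)
```

The span is the difference between the two dates, so counting both end dates inclusively would give 32 calendar days.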


Please be aware that rerunning the SQL will yield different results, as COVID remains an ongoing issue.

Works Cited

Below are some of the key resources used in this project.


Cancel, Daniel, and Andrea Jaramillo. “Covid-19 in South America: Vaccine Rollout Lead to Big Drop in Cases.” Bloomberg.com, 29 September 2021, https://www.bloomberg.com/news/articles/2021-09-29/south-america-gets-covid-break-from-vaccines-after-deadly-wave. Accessed 18 December 2021.


ECDC. “Questions and answers on COVID-19: Basic facts.” ECDC, 8 September 2021, https://www.ecdc.europa.eu/en/covid-19/questions-answers/questions-answers-basic-facts. Accessed 18 December 2021.


Gallagher, James. “Coronavirus: What it does to the body.” BBC, 14 March 2020, https://www.bbc.co.uk/news/health-51214864. Accessed 20 December 2021.


Geddes, Linda. “False negative: How long does it take for coronavirus to become detectable by PCR?” Gavi, the Vaccine Alliance, 5 July 2021, https://www.gavi.org/vaccineswork/false-negative-how-long-does-it-take-coronavirus-become-detectable-pcr. Accessed 20 December 2021.


The Guardian. “Why coronavirus death rates are so different.” YouTube, 2020, https://www.youtube.com/watch?v=sMtzWVTPmLI. Accessed 9 April 2021.


WorldBank. “Mexico Overview: Development news, research, data | World Bank.” World Bank Group, 06 10 2021, https://www.worldbank.org/en/country/mexico/overview#1. Accessed 18 December 2021.
