Timestamp
1990-2019
Data source
Article: owidmalaria
Authors: Max Roser, Hannah Ritchie
Journal: Our World in Data
Source: https://ourworldindata.org/malaria
License: https://creativecommons.org/publicdomain/zero/1.0/
Dataset
Metadata
Entity (Country/Region): The specific countries or regions under consideration
Code: The 3-letter code of the country
Year: The year of the observation, from the earliest to the latest year available (1990 to 2019)
Deaths - Malaria - Sex: Both - Age: Age-standardized (Rate): The age-standardized malaria death rate, in deaths per 100,000 people, for each country and year, both sexes combined
Protocol
Data quality check:
- Examine the dataset for outliers and missing values
- Address missing values and outliers through proper measures
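The two checks above can be sketched with pandas. The source does not say which measures were applied, so the 1.5 × IQR outlier rule and the per-country linear interpolation below are assumptions chosen for illustration. The sample rows, the placeholder entity names, and the helper names `quality_report` and `fill_gaps` are hypothetical.

```python
import pandas as pd

RATE = "Deaths - Malaria - Sex: Both - Age: Age-standardized (Rate)"

# Hypothetical sample in the layout described under Metadata (placeholder values).
df = pd.DataFrame({
    "Entity": ["Country A"] * 3 + ["Country B"] * 3,
    "Code": ["AAA"] * 3 + ["BBB"] * 3,
    "Year": [1990, 1991, 1992, 1990, 1991, 1992],
    RATE: [10.0, None, 12.0, 50.0, 48.0, 47.0],
})


def quality_report(frame, value_col):
    """Count missing values and list rows outside the 1.5 * IQR fences."""
    values = frame[value_col]
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    is_outlier = (values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)
    return {"missing": int(values.isna().sum()), "outliers": frame.loc[is_outlier]}


def fill_gaps(frame, value_col):
    """Assumed treatment: interpolate linearly within each country, ordered by year."""
    out = frame.sort_values(["Entity", "Year"]).copy()
    out[value_col] = out.groupby("Entity")[value_col].transform(
        lambda s: s.interpolate(limit_direction="both")
    )
    return out


report = quality_report(df, RATE)
cleaned = fill_gaps(df, RATE)
```

Flagged outliers are returned for manual review rather than dropped, since a very high death rate may be a genuine observation.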
Country selection:
- Manually filter out non-African countries, as there is no dedicated continent column available.
Column filtration:
- Remove unnecessary columns, such as the ‘Code’ column
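Country selection and column filtration might look like the sketch below. Because the dataset has no continent column, the filter relies on a hand-maintained set of names. `AFRICAN_COUNTRIES` is deliberately truncated and would need to be completed, and the `Rate` column and its values are placeholders, not real data.

```python
import pandas as pd

# Assumption: a manually curated set of African entity names (truncated here).
AFRICAN_COUNTRIES = {"Nigeria", "Kenya", "Ghana"}


def select_african(frame, countries=AFRICAN_COUNTRIES):
    """Keep rows whose Entity is in the curated set, then drop the Code column."""
    kept = frame[frame["Entity"].isin(countries)]
    return kept.drop(columns=["Code"]).reset_index(drop=True)


# Placeholder rows for illustration only.
sample = pd.DataFrame({
    "Entity": ["Nigeria", "Brazil", "Kenya"],
    "Code": ["NGA", "BRA", "KEN"],
    "Year": [2000, 2000, 2000],
    "Rate": [1.0, 2.0, 3.0],
})
africa = select_african(sample)
```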
Aggregation:
- Pivot the data from long to wide format: one row per African country and one column per year, with the death rate as the cell value.
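The aggregation step amounts to a long-to-wide reshape, which pandas provides through `DataFrame.pivot`. The sketch below assumes each (country, year) pair occurs only once. The `to_wide` helper, the `Rate` column, and the values are hypothetical.

```python
import pandas as pd


def to_wide(frame, value_col):
    """Reshape to one row per country and one column per year."""
    wide = frame.pivot(index="Entity", columns="Year", values=value_col)
    # Use string column labels so the exported CSV header reads 1990, 1991, ...
    wide.columns = [str(year) for year in wide.columns]
    return wide.reset_index()


# Placeholder rows for illustration only.
long_df = pd.DataFrame({
    "Entity": ["Nigeria", "Nigeria", "Kenya", "Kenya"],
    "Year": [1990, 1991, 1990, 1991],
    "Rate": [4.0, 3.5, 2.0, 1.5],
})
wide_df = to_wide(long_df, "Rate")
```

`pivot` raises an error if a (country, year) pair is duplicated, which doubles as a sanity check on the input.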
Enhancement for Race Plot in Flourish:
- The flags bundled with Flourish were found to be incomplete, so an additional dataset containing country flags was obtained for the race plot (reference: [Countries ISO Codes, Continent, Flags, and URL Dataset](https://www.kaggle.com/datasets/andreshg/countries-iso-codes-continent-flags-url)).
- Country names in the flag dataset were harmonized to match the spelling used in the primary dataset, so that each flag is associated with the correct country.
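Name harmonization followed by a left join could be sketched as follows. The flag-dataset column names `country` and `image_url`, the `NAME_FIXES` mapping, and the example URL are assumptions for illustration, so check them against the actual file. The function also returns the countries that still lack a flag, so remaining mismatches can be fixed by hand.

```python
import pandas as pd

# Hypothetical mapping: flag-dataset spelling -> primary-dataset spelling.
NAME_FIXES = {"Ivory Coast": "Cote d'Ivoire"}


def attach_flags(wide, flags, name_col="country", url_col="image_url"):
    """Left-join flag URLs onto the wide table and list countries with no match."""
    flags = flags[[name_col, url_col]].copy()
    flags[name_col] = flags[name_col].replace(NAME_FIXES)
    merged = wide.merge(
        flags, left_on="Entity", right_on=name_col, how="left", indicator=True
    )
    unmatched = merged.loc[merged["_merge"] == "left_only", "Entity"].tolist()
    return merged.drop(columns=[name_col, "_merge"]), unmatched


# Placeholder rows for illustration only.
wide = pd.DataFrame({"Entity": ["Cote d'Ivoire", "Kenya"], "1990": [1.0, 2.0]})
flags = pd.DataFrame({
    "country": ["Ivory Coast"],
    "image_url": ["https://example.com/ci.png"],
})
with_flags, missing = attach_flags(wide, flags)
```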
Flourish configuration:
- ID: Country
- Name: Country
- Label: Year