This project cleans, statistically analyzes, and visually explores multiple World Bank indicators across selected countries. It generates line plots, bar charts, and correlation heatmaps that give insight into global trends such as agricultural land use, forest area, CO₂ emissions, renewable energy consumption, urban population, and mortality rates.

The script:
- Loads and filters World Bank CSV datasets
- Selects six countries for analysis: France, India, Netherlands, Hungary, Germany, Australia
- Extracts data from 1990–2020
- Generates:
  - Statistical summaries
  - Line plots (1990→2020)
  - Bar charts for selected years
  - Correlation heatmaps
- Saves figures as PNG files

Each dataset corresponds to a World Bank indicator:

| Indicator | File | Description |
|---|---|---|
| Agriculture land (% of land area) | API_AG.LND.AGRI.ZS_DS2... | Land used for agriculture |
| Forest area (% of land area) | API_AG.LND.FRST.ZS_DS2... | Total forest coverage |
| CO₂ emissions (kt) | API_EN.ATM.CO2E.KT_DS2... | Measured in kilotons |
| Urban population | API_SP.URB.TOTL_DS2... | Number of people living in urban areas |
| Renewable energy consumption (%) | API_EG.FEC.RNEW.ZS_DS2... | Share of renewables in energy use |
| Mortality rate (under 5) | API_SH.DYN.MORT_DS2... | Under-5 mortality rate |

- Reads CSV files
- Selects countries of interest
- Returns:
  - Transposed dataframe (years as rows)
  - Standard dataframe (countries as rows)

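
The loading step listed above might look roughly like the following sketch. The function name `read_indicator`, its parameters, and the tiny in-memory sample (placeholder values, not real World Bank figures) are assumptions for illustration. The `skiprows=4` default assumes the metadata lines that World Bank bulk CSV downloads usually place above the header row, so check it against your own files.

```python
import io

import pandas as pd

COUNTRIES = ["France", "India", "Netherlands", "Hungary", "Germany", "Australia"]


def read_indicator(source, countries=COUNTRIES, start=1990, end=2020, skiprows=4):
    """Return (years-as-rows, countries-as-rows) dataframes for one indicator.

    Hypothetical helper: `source` is a file path or file-like object.
    """
    raw = pd.read_csv(source, skiprows=skiprows)
    # Keep only the year columns that actually exist in this file.
    years = [str(y) for y in range(start, end + 1) if str(y) in raw.columns]
    by_country = (
        raw[raw["Country Name"].isin(countries)]
        .set_index("Country Name")[years]
    )
    # Transpose so that each row is a year and each column a country.
    by_year = by_country.T
    by_year.index = by_year.index.astype(int)
    return by_year, by_country


# Usage with made-up placeholder values (no metadata lines, so skiprows=0).
sample = io.StringIO(
    "Country Name,Country Code,Indicator Name,Indicator Code,1990,1991\n"
    "France,FRA,Demo indicator,DEMO.CODE,1.0,2.0\n"
    "Brazil,BRA,Demo indicator,DEMO.CODE,3.0,4.0\n"
    "India,IND,Demo indicator,DEMO.CODE,5.0,6.0\n"
)
by_year, by_country = read_indicator(sample, skiprows=0)
print(by_year)
print(by_country)
```

Returning both orientations keeps later steps simple: the years-as-rows frame suits line plots, while the countries-as-rows frame suits grouped bar charts.
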
Calculates and prints:
- Summary statistics (`describe()`)
- Skewness
- Kurtosis
- Median
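
As a rough illustration of the statistics listed above, the sketch below computes them per column with pandas and `scipy.stats`. The helper name `summarise` and the sample values are hypothetical, and the original script may compute these quantities differently (for example with pandas' own `skew()` and `kurt()`, which apply bias corrections and so give slightly different numbers).

```python
import pandas as pd
from scipy.stats import kurtosis, skew


def summarise(df):
    """Print describe() and return skewness, kurtosis, and median per column."""
    print(df.describe())
    stats = pd.DataFrame({
        # scipy's defaults: biased estimators, Fisher (excess) kurtosis.
        "skewness": df.apply(lambda col: skew(col.dropna())),
        "kurtosis": df.apply(lambda col: kurtosis(col.dropna())),
        "median": df.median(),
    })
    print(stats)
    return stats


# Placeholder data: column A is symmetric, column B has one large outlier.
demo = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [1.0, 1.0, 1.0, 1.0, 10.0],
})
stats = summarise(demo)
```

Missing values are dropped per column before the scipy calls, since World Bank series often have gaps for individual years.
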
Creates line plots for all six countries across years 1990–2020.
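
A minimal sketch of such a line plot is shown below, assuming a years-as-rows dataframe with one column per country. The function name `line_plot`, the output file name, and the demo values are placeholders. The `Agg` backend is selected only so the sketch runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this for on-screen plots
import matplotlib.pyplot as plt
import pandas as pd


def line_plot(by_year, title, ylabel, filename):
    """Draw one line per country and save the figure as a PNG."""
    fig, ax = plt.subplots(figsize=(10, 6))
    for country in by_year.columns:
        ax.plot(by_year.index, by_year[country], label=country)
    ax.set_title(title)
    ax.set_xlabel("Year")
    ax.set_ylabel(ylabel)
    ax.legend()
    fig.savefig(filename, dpi=300, bbox_inches="tight")
    plt.close(fig)
    return ax


# Placeholder values, not real indicator data.
demo = pd.DataFrame(
    {"India": [1.0, 2.0, 3.0], "Germany": [3.0, 2.5, 2.0]},
    index=[1990, 2005, 2020],
)
ax = line_plot(demo, "Demo indicator", "Value", "demo_line.png")
```
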
Creates grouped bar charts for selected years:
- 2005, 2010, 2015, 2020
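
One way to draw such a grouped bar chart with plain matplotlib is sketched below. It assumes a countries-as-rows dataframe whose year columns are strings. The function name `grouped_bar`, the file name, and the demo values are hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this for on-screen plots
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def grouped_bar(by_country, years, title, ylabel, filename):
    """Draw one group of bars per country, one bar per selected year."""
    fig, ax = plt.subplots(figsize=(10, 6))
    x = np.arange(len(by_country.index))
    width = 0.8 / len(years)  # leave a gap between country groups
    for i, year in enumerate(years):
        ax.bar(x + i * width, by_country[year], width, label=year)
    # Centre the tick labels under each group of bars.
    ax.set_xticks(x + width * (len(years) - 1) / 2)
    ax.set_xticklabels(by_country.index)
    ax.set_title(title)
    ax.set_ylabel(ylabel)
    ax.legend(title="Year")
    fig.savefig(filename, dpi=300, bbox_inches="tight")
    plt.close(fig)
    return ax


# Placeholder values, not real indicator data.
years = ["2005", "2010", "2015", "2020"]
demo = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]],
    index=["India", "Germany"],
    columns=years,
)
ax = grouped_bar(demo, years, "Demo indicator", "Value", "demo_bar.png")
```
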
Generates a correlation heatmap for different indicators for a single country.
Countries visualized:
- India
- Germany
- Australia
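
The per-country heatmap could be produced along the lines of the sketch below, which assumes a dataframe with one row per year and one column per indicator for a single country. It uses `imshow` from matplotlib because that library is among the listed dependencies; the function name `correlation_heatmap`, the colour map, the file name, and the demo values are assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this for on-screen plots
import matplotlib.pyplot as plt
import pandas as pd


def correlation_heatmap(indicators, country, filename):
    """Plot the Pearson correlation matrix of the indicator columns."""
    corr = indicators.corr()
    n = len(corr.columns)
    fig, ax = plt.subplots(figsize=(8, 6))
    im = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
    fig.colorbar(im, ax=ax)
    ax.set_xticks(range(n))
    ax.set_xticklabels(corr.columns, rotation=45, ha="right")
    ax.set_yticks(range(n))
    ax.set_yticklabels(corr.index)
    # Write each coefficient inside its cell.
    for i in range(n):
        for j in range(n):
            ax.text(j, i, f"{corr.iloc[i, j]:.2f}", ha="center", va="center")
    ax.set_title(country)
    fig.savefig(filename, dpi=300, bbox_inches="tight")
    plt.close(fig)
    return corr


# Placeholder values: "b" is a decreasing linear function of "a".
demo = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0, 4.0], "b": [8.0, 6.0, 4.0, 2.0], "c": [1.0, 3.0, 2.0, 4.0]},
    index=[1990, 2000, 2010, 2020],
)
corr = correlation_heatmap(demo, "Demo country", "demo_heatmap.png")
```

Fixing the colour scale to the range -1 to 1 keeps heatmaps for different countries visually comparable.
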
The script creates and saves:
- Agriculture land
- Forest land
- CO₂ emissions
- Urban population
- Correlation matrix for each selected country
All images are saved as high-quality PNG files (dpi=300).
Install the necessary libraries:

`pip install numpy pandas matplotlib scipy`

Place all CSV files in the working directory and run:

`python main.py`

| Country |
|---|
| France |
| India |
| Netherlands |
| Hungary |
| Germany |
| Australia |

- Growth or decline in agricultural land use
- Forest land changes over 30 years
- CO₂ emissions trends across industrialized vs. developing economies
- Urbanization growth patterns
- Correlations between:
  - Renewable energy use
  - CO₂ emissions
  - Mortality rates
  - Forest/agriculture land changes

This analysis is suitable for:
- Data analysis assignments
- Environmental and economic studies
- Time-series trend visualization
- Understanding multi-indicator correlations