How to Produce 8 Meaningful EDA Visualizations in Under an Hour Using World Bank GDP Data
Exploratory Data Analysis (EDA) is one of the fastest ways to understand economic data.
With just Pandas, Matplotlib, and Seaborn, you can produce high-quality insights from World Bank GDP datasets in under an hour.
In this tutorial, we will use World Bank GDP data to create 8 meaningful visualizations commonly used in:
economics,
business intelligence,
public policy,
and data science.
Upload and Load the Dataset
We will assume you downloaded GDP data from the World Bank Data Catalog.
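As a minimal sketch of the loading step, the helper below reads a World Bank bulk CSV download into a wide-format DataFrame (one row per country, one column per year). It assumes the file starts with four metadata lines before the header row and that each row ends with a trailing comma, which pandas reads as an empty `Unnamed` column; check both against your own file. The function name `load_gdp` and the file name in the usage note are placeholders, not part of the original tutorial.

```python
import pandas as pd


def load_gdp(source, preamble_rows=4):
    """Read a wide-format World Bank CSV into a DataFrame.

    `source` can be a file path or an open file-like object.
    `preamble_rows` is an assumption about the download format: the number
    of metadata lines that precede the header row.
    """
    df = pd.read_csv(source, skiprows=preamble_rows)
    # Trailing commas in the file appear as empty "Unnamed: N" columns; drop them.
    df = df.loc[:, ~df.columns.str.startswith("Unnamed")]
    return df
```

With a downloaded file saved as, say, `gdp.csv` (a placeholder name), `df = load_gdp("gdp.csv")` gives the `df` used in the rest of this tutorial, and `df.head()` is a quick way to confirm that the year columns were parsed.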
Import Visualization Libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("whitegrid")
1. Histogram — Understand GDP Distribution
Histograms reveal how GDP values are distributed globally.
This quickly reveals:
skewness,
outliers,
and economic concentration.
Most GDP datasets are heavily right-skewed.
2. Bar Chart — Top 10 Largest Economies
Bar charts are excellent for rankings.
This helps identify dominant economies instantly.
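A possible implementation of the ranking is sketched below, again assuming the wide-format `df` with a `'Country Name'` column and string year columns. World Bank files also contain aggregate rows such as "World", so the sketch takes an optional `exclude` list of names to drop; which names to exclude is left to you. Both function names are hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def top_economies(df, year="2022", n=10, exclude=()):
    """Return the n rows with the largest value in `year`, largest first."""
    rows = df[~df["Country Name"].isin(list(exclude))]
    return rows.nlargest(n, year)[["Country Name", year]]


def plot_top_economies(df, year="2022", n=10, exclude=()):
    """Draw a horizontal bar chart of the n largest economies."""
    top = top_economies(df, year=year, n=n, exclude=exclude)
    fig, ax = plt.subplots(figsize=(10, 6))
    # Horizontal bars keep long country names readable.
    sns.barplot(data=top, x=year, y="Country Name", color="steelblue", ax=ax)
    ax.set_title(f"Top {n} Economies by GDP, {year}")
    ax.set_xlabel("GDP (current US$)")
    ax.set_ylabel("")
    return ax
```

For example, `plot_top_economies(df, year="2022", exclude=["World"])` followed by `plt.show()`. If regional or income-group aggregates still appear in the chart, add their names to `exclude`.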
3. Line Chart — GDP Growth Over Time
Line charts are essential for time-series analysis.
# Get all year columns (column names are strings such as '1960')
year_cols = [col for col in df.columns if col.isdigit() and 1960 <= int(col) <= 2025]
# Melt the DataFrame so that years become a single column
df_melted = df.melt(
    id_vars=['Country Name', 'Country Code', 'Indicator Name', 'Indicator Code'],
    value_vars=year_cols, var_name='Year', value_name='GDP')
# Convert the 'Year' column to numeric
df_melted['Year'] = pd.to_numeric(df_melted['Year'])
# Filter for Kenya using the 'Country Name' column
kenya = df_melted[df_melted['Country Name'] == 'Kenya']
plt.figure(figsize=(10, 5))
sns.lineplot(data=kenya, x='Year', y='GDP')
plt.title('Kenya GDP Over Time')
plt.xlabel('Year')
plt.ylabel('GDP (current US$)')
plt.show()
Perfect for:
economic growth analysis,
forecasting,
and trend detection.
4. Box Plot — Detect Economic Outliers
Box plots expose extreme GDP values.
You will immediately see:
ultra-large economies,
global inequality,
and distribution spread.
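A minimal sketch of the box plot follows, assuming the same wide-format `df`. It also lists the countries that fall above the upper whisker, using the 1.5 × IQR rule that seaborn's box plot applies by default. The function names and the optional `log_x` switch are hypothetical additions.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def upper_outliers(df, year="2022", whis=1.5):
    """Return rows whose value exceeds Q3 + whis * IQR, largest first."""
    values = df[["Country Name", year]].dropna()
    q1 = values[year].quantile(0.25)
    q3 = values[year].quantile(0.75)
    fence = q3 + whis * (q3 - q1)
    return values[values[year] > fence].sort_values(year, ascending=False)


def plot_gdp_boxplot(df, year="2022", log_x=False):
    """Draw a horizontal box plot of GDP for one year."""
    values = df[year].dropna()
    values = values[values > 0]
    fig, ax = plt.subplots(figsize=(10, 3))
    sns.boxplot(x=values, ax=ax)
    if log_x:
        # Only the axis is rescaled; the whiskers are still computed on raw values.
        ax.set_xscale("log")
    ax.set_title(f"GDP Distribution, {year}")
    ax.set_xlabel("GDP (current US$)")
    return ax
```

Calling `upper_outliers(df, "2022")` alongside `plot_gdp_boxplot(df, "2022")` puts names on the points beyond the whisker. Aggregate rows such as "World" will show up as outliers too unless you remove them first.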
5. Scatter Plot — GDP per Capita: 2021 vs 2022
Scatter plots show relationships between variables.
The scatter plot comparing GDP per capita for 2021 and 2022 reveals:
- Economic Consistency: Most countries tend to maintain similar GDP per capita levels year over year, forming a diagonal trend on the plot.
- Growth or Decline: Countries positioned above the diagonal line have experienced an increase in GDP per capita from 2021 to 2022, while those below have seen a decrease.
- Significant Shifts: Outliers or points far from the main diagonal highlight countries that have undergone substantial economic changes, either rapid growth or significant downturns, in that one-year period.
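The comparison described above can be sketched as follows. The code assumes `df` holds a wide-format GDP per capita table (for example the World Bank indicator `NY.GDP.PCAP.CD`) with string year columns; if you loaded total GDP instead, the same code runs but the axis labels should change. The function name is hypothetical. A dashed `y = x` line is added so that points above it (growth) and below it (decline) are easy to tell apart.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_year_comparison(df, year_x="2021", year_y="2022"):
    """Scatter one year against another, with a y = x reference line."""
    pair = df[["Country Name", year_x, year_y]].dropna()
    # Log axes need strictly positive values.
    pair = pair[(pair[year_x] > 0) & (pair[year_y] > 0)]

    fig, ax = plt.subplots(figsize=(7, 7))
    sns.scatterplot(data=pair, x=year_x, y=year_y, alpha=0.7, ax=ax)
    # Diagonal: countries on this line had the same value in both years.
    lo = pair[[year_x, year_y]].min().min()
    hi = pair[[year_x, year_y]].max().max()
    ax.plot([lo, hi], [lo, hi], linestyle="--", color="gray")
    ax.set_xscale("log")
    ax.set_yscale("log")
    ax.set_xlabel(f"GDP per capita, {year_x} (current US$)")
    ax.set_ylabel(f"GDP per capita, {year_y} (current US$)")
    ax.set_title(f"GDP per Capita: {year_x} vs {year_y}")
    return ax
```

Usage: `plot_year_comparison(df, "2021", "2022")` followed by `plt.show()`. Log axes are a design choice here: they keep low-income and high-income countries visible on the same plot.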
6. Heatmap — Correlation Analysis
Correlation heatmaps help identify relationships across metrics.
selected_years = ['2000', '2010', '2020', '2021', '2022']
corr = df[selected_years].corr()
plt.figure(figsize=(8,6))
sns.heatmap(corr, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('GDP per Capita Correlations Across Selected Years')
plt.show()
Useful for:
feature selection,
model preparation,
and economic research.
7. Pair Plot — Multi-Variable Exploration
Pair plots automate multiple visual comparisons.
selected_years_for_pairplot = ['2000', '2010', '2020', '2021', '2022']
sns.pairplot(df[selected_years_for_pairplot].dropna())
plt.suptitle('Pair Plot of GDP per Capita Across Selected Years', y=1.02)
plt.show()
This creates:
scatter plots,
histograms,
and variable relationships automatically.
Extremely useful during rapid EDA.
8. Violin Plot — Distribution by Region
Violin plots combine density and distribution analysis.
This reveals:
regional inequality,
spread,
and concentration patterns.
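The main GDP table has no region column, so a violin plot by region needs a lookup table first. The sketch below assumes a second DataFrame, here called `regions`, with `'Country Code'` and `'Region'` columns, such as the country metadata file that usually accompanies a World Bank CSV download; that file, its column names, and the empty `Region` for aggregate rows are assumptions to verify against your own data. The function names are hypothetical. Values are plotted as log10 of GDP so the violins are not flattened by the largest economies.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns


def attach_regions(df, regions):
    """Add a 'Region' column by country code; rows with no region are dropped."""
    merged = df.merge(regions[["Country Code", "Region"]], on="Country Code", how="left")
    return merged.dropna(subset=["Region"])


def plot_gdp_by_region(df, regions, year="2022"):
    """Draw one violin per region for log10 of the chosen year's values."""
    data = attach_regions(df, regions).dropna(subset=[year])
    data = data[data[year] > 0].copy()
    data["log10 GDP"] = np.log10(data[year])

    fig, ax = plt.subplots(figsize=(10, 6))
    sns.violinplot(data=data, x="log10 GDP", y="Region", ax=ax)
    ax.set_title(f"GDP Distribution by Region, {year}")
    ax.set_xlabel("log10 of GDP (current US$)")
    ax.set_ylabel("")
    return ax
```

Usage: `plot_gdp_by_region(df, regions, year="2022")` followed by `plt.show()`, where `regions` is loaded with `pd.read_csv` from your metadata file.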
Why These 8 Visualizations Matter
Together, these charts help answer critical questions:
| Visualization | Main Purpose |
|---|---|
| Histogram | Distribution analysis |
| Bar Chart | Rankings |
| Line Chart | Trends over time |
| Box Plot | Outlier detection |
| Scatter Plot | Relationships |
| Heatmap | Correlation analysis |
| Pair Plot | Multi-variable EDA |
| Violin Plot | Distribution comparison |
These are foundational visualizations used in:
data science,
machine learning,
BI dashboards,
and economic analytics.
You do not need advanced tools to produce meaningful EDA quickly. With:
Pandas,
Matplotlib,
and Seaborn,
you can generate professional-grade economic insights in less than an hour.
World Bank GDP data is especially powerful because it contains:
long-term trends,
global comparisons,
economic inequality patterns,
and strong statistical relationships.
Mastering these 8 visualizations gives you a strong foundation for deeper analytics, forecasting, and machine learning workflows.