How to Visualise the Relationship Between Two Economic Variables Using the World Bank API
Economic data becomes far more valuable when you can visualize relationships between variables.
Instead of manually downloading CSV files, professional analysts often use APIs to pull live economic data directly into Python workflows.
One of the best sources for this is the World Bank API.
Using the API allows you to:
Access updated economic indicators
Automate data collection
Build scalable ML pipelines
Analyze global datasets efficiently
Build economic dashboards that refresh as new data is published
In this tutorial, we will use the World Bank API to analyze the relationship between:
Internet Usage (%)
GDP Growth (%)
using Python and visualization tools.
Why Use the World Bank API?
The World Bank API gives direct programmatic access to thousands of global indicators.
You can retrieve:
GDP
Inflation
Population
Trade
Education
Internet usage
Energy access
Health indicators
without manually downloading spreadsheets.
This is extremely useful for:
Machine learning
Data engineering
Economic forecasting
Automated analytics pipelines
Step 1: Install Required Libraries
We will use:
pandas
wbdata
matplotlib
seaborn
Install them:
pip install wbdata pandas matplotlib seaborn
Step 2: Import the Libraries
import wbdata
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
The wbdata package simplifies interaction with the World Bank API.
Step 3: Define the Economic Indicators
Each World Bank indicator has a code.
For this project:
| Economic Variable | Indicator Code |
|---|---|
| GDP Growth (%) | NY.GDP.MKTP.KD.ZG |
| Internet Usage (%) | IT.NET.USER.ZS |
Define them:
indicators = {
    "NY.GDP.MKTP.KD.ZG": "GDP_Growth",
    "IT.NET.USER.ZS": "Internet_Usage"
}
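To see where these codes end up, it can help to look at the request URL itself. The sketch below builds a URL for the public World Bank API v2 endpoint without sending any request. The helper name `build_indicator_url` and its parameters are made up for this illustration and are not part of wbdata.

```python
from urllib.parse import urlencode

# Base address of the public World Bank API (version 2).
BASE_URL = "https://api.worldbank.org/v2"


def build_indicator_url(indicator, country="all", start_year=None,
                        end_year=None, per_page=1000):
    """Build a World Bank API v2 URL for one indicator (no request is sent).

    Hypothetical helper for illustration; wbdata handles this for you.
    """
    params = {"format": "json", "per_page": per_page}
    if start_year is not None and end_year is not None:
        # The API accepts a year range written as start:end.
        params["date"] = f"{start_year}:{end_year}"
    # safe=":" keeps the colon in the date range readable in the URL.
    query = urlencode(params, safe=":")
    return f"{BASE_URL}/country/{country}/indicator/{indicator}?{query}"


url = build_indicator_url("IT.NET.USER.ZS", country="KEN",
                          start_year=2015, end_year=2020)
print(url)
```

Opening a URL like this in a browser is a quick way to check that an indicator code is valid before wiring it into a pipeline.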
Step 4: Pull Data from the World Bank API
Retrieve the data:
df = wbdata.get_dataframe(indicators)
Reset the index:
df = df.reset_index()
Preview the dataset:
print(df.head())
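You now have current economic data directly from the World Bank. Because no country is specified, the result covers every economy the API reports, including aggregates such as regions and income groups, not only individual countries.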
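The reset step is easier to follow on a tiny example. The sketch below assumes the downloaded frame is indexed by country and date, which is what the later filtering steps rely on. The frame is built by hand with placeholder numbers, so it runs without a network connection and says nothing about real economies.

```python
import pandas as pd

# Placeholder values for illustration only, not real World Bank figures.
index = pd.MultiIndex.from_product(
    [["Country A", "Country B"], ["2021", "2020"]],
    names=["country", "date"],
)
raw = pd.DataFrame(
    {
        "GDP_Growth": [1.0, 2.0, 3.0, 4.0],
        "Internet_Usage": [10.0, 20.0, 30.0, 40.0],
    },
    index=index,
)

# reset_index() turns the two index levels into ordinary columns,
# so they can be filtered with df["country"] and df["date"].
flat = raw.reset_index()
print(flat.columns.tolist())
print(flat.head())
```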
Step 5: Clean the Dataset
Remove missing values:
df = df.dropna()
Check the remaining rows:
print(df.shape)
Real-world economic datasets often contain incomplete observations.
Cleaning is a critical step in analytics and machine learning.
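Before dropping rows, it is often worth checking how much data each column is missing, since a single sparse indicator can remove most of the dataset. The following is a minimal sketch on a hand-made frame with placeholder values; the variable names are illustrative only.

```python
import pandas as pd

nan = float("nan")

# Placeholder values for illustration only, not real World Bank figures.
toy = pd.DataFrame(
    {
        "country": ["A", "A", "B", "B"],
        "date": ["2020", "2021", "2020", "2021"],
        "GDP_Growth": [1.5, nan, 2.0, 3.0],
        "Internet_Usage": [40.0, 45.0, nan, 60.0],
    }
)

# Count missing values per column before removing anything.
missing_per_column = toy.isna().sum()
print(missing_per_column)

# Drop only rows that lack one of the two variables being compared.
cleaned = toy.dropna(subset=["GDP_Growth", "Internet_Usage"])
rows_dropped = len(toy) - len(cleaned)
print(f"Dropped {rows_dropped} of {len(toy)} rows")
```

Passing `subset` keeps rows that are complete for the two variables of interest even if other columns added later contain gaps.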
Step 6: Filter Recent Years
To make the analysis more relevant, keep recent observations only.
df = df[df["date"] >= "2015"]
You can also filter by country.
Example:
kenya_df = df[df["country"] == "Kenya"]
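The year comparison above works on text because four-digit year strings sort in the same order as the numbers they represent. If you prefer an explicit numeric comparison, or want several countries at once, something like the sketch below may be clearer. It assumes the date column holds year strings; the `year` column and the country labels are introduced here purely for illustration.

```python
import pandas as pd

# A stand-in for the country and date columns of the downloaded frame.
toy = pd.DataFrame(
    {
        "country": ["Country A", "Country A", "Country B", "Country C"],
        "date": ["2014", "2016", "2015", "2019"],
    }
)

# Convert the year strings to integers so the comparison is numeric.
toy["year"] = toy["date"].astype(int)
recent = toy[toy["year"] >= 2015]

# isin() keeps rows whose country matches any name in the list.
selected = recent[recent["country"].isin(["Country A", "Country B"])]
print(selected)
```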
Step 7: Create a Scatter Plot
Now visualize the relationship between internet usage and GDP growth.
plt.figure(figsize=(10,6))
plt.scatter(
    df["Internet_Usage"],
    df["GDP_Growth"]
)
plt.xlabel("Internet Usage (%)")
plt.ylabel("GDP Growth (%)")
plt.title("Internet Usage vs GDP Growth")
plt.show()
This scatter plot allows you to visually inspect the relationship between the two variables.
Step 8: Add a Regression Line
Regression lines help estimate overall trends.
sns.regplot(
    x="Internet_Usage",
    y="GDP_Growth",
    data=df
)
plt.title("Internet Usage vs GDP Growth")
plt.show()
The fitted line shows the average linear trend in the data, and the shaded band around it is a confidence interval for that fit.
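With its default settings, the regression plot draws an ordinary least squares line. To make that less of a black box, here is a minimal sketch that computes the slope and intercept of such a line by hand. The function name `fit_line` and the sample points are made up for illustration and are not economic data.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative points that lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

A positive slope means the y variable tends to rise as the x variable rises; it says nothing on its own about how tightly the points cluster around the line.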
Understanding Correlation
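Positive correlation means both variables tend to move in the same direction.
For example, higher internet adoption would go hand in hand with higher economic growth.
A perfect positive relationship would appear as points falling exactly on an upward-sloping straight line.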
Most economic relationships are weaker and noisier because many external factors affect economies simultaneously.
Step 9: Calculate Correlation Numerically
You can calculate the correlation coefficient directly.
correlation = df["Internet_Usage"].corr(
    df["GDP_Growth"]
)
print(correlation)
Correlation values range from -1 to 1:
| Value | Meaning |
|---|---|
| 1 | Perfect positive correlation |
| 0 | No linear correlation |
| -1 | Perfect negative correlation |
This gives a quantitative measure of the relationship.
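By default, the pandas `corr` method computes the Pearson coefficient, which only measures linear association. The sketch below computes it by hand on small made-up sequences to show the three reference values. The function name `pearson_r` is chosen for this example, and the numbers are not economic data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Points on a rising line, a falling line, and a U-shaped curve.
rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
falling = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
u_shaped = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(rising, falling, u_shaped)
```

The U-shaped case is worth noting: the two sequences are clearly related, yet the coefficient is zero because the relationship is not linear. This is one reason to look at the scatter plot as well as the number.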
Why This Matters for Machine Learning
Visualization is often the first stage of machine learning.
Before training models, analysts need to understand:
Variable relationships
Data quality
Outliers
Feature strength
Predictor usefulness
Economic visualization helps improve:
Feature engineering
Model selection
Forecasting accuracy
Business interpretation
This is why data visualization is foundational in:
Data science
Economics
Financial analytics
Business intelligence
Machine learning
The World Bank API is one of the best free resources for learning economic analytics and machine learning.
By combining:
API-driven data collection
Pandas
Visualization
Correlation analysis
you can build professional-grade analytics workflows using real-world global economic data.
Learning how to visualize relationships between economic variables is one of the first major steps toward becoming a strong data analyst or machine learning engineer.