How to Visualise the Relationship Between Two Economic Variables Using the World Bank API

May 29, 2026

Economic data becomes far more valuable when you can visualize relationships between variables.

Instead of manually downloading CSV files, professional analysts often use APIs to pull live economic data directly into Python workflows.

One of the best sources for this is the World Bank API.

Using the API allows you to:

Access updated economic indicators
Automate data collection
Build scalable ML pipelines
Analyze global datasets efficiently
Create economic dashboards that refresh as new data is published

In this tutorial, we will use the World Bank API to analyze the relationship between:

Internet Usage (%)
GDP Growth (%)

using Python and visualization tools.

Why Use the World Bank API?

The World Bank API gives direct programmatic access to thousands of global indicators.

You can retrieve:

GDP
Inflation
Population
Trade
Education
Internet usage
Energy access
Health indicators

without manually downloading spreadsheets.
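In practice, programmatic access means requesting a URL that names the country and the indicator, then reading the JSON that comes back. The sketch below assumes the World Bank v2 endpoint layout and a two-part JSON response (metadata first, records second). The helper names `build_indicator_url` and `parse_records` are hypothetical, and the sample payload is hand-written with made-up values only to show the shape.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.worldbank.org/v2"


def build_indicator_url(country, indicator, start_year, end_year, per_page=1000):
    """Build a request URL for one country and one indicator (hypothetical helper)."""
    query = urlencode(
        {
            "format": "json",
            "date": f"{start_year}:{end_year}",
            "per_page": per_page,
        },
        safe=":",  # keep the year range readable as 2015:2020
    )
    return f"{BASE_URL}/country/{country}/indicator/{indicator}?{query}"


def parse_records(payload_text):
    """Return (country, year, value) tuples, skipping records with no value."""
    _metadata, records = json.loads(payload_text)
    return [
        (r["country"]["value"], int(r["date"]), r["value"])
        for r in records
        if r["value"] is not None
    ]


url = build_indicator_url("KEN", "IT.NET.USER.ZS", 2015, 2020)
print(url)

# Hand-written payload with made-up values; not real World Bank data.
sample_payload = json.dumps(
    [
        {"page": 1, "pages": 1, "per_page": 1000, "total": 2},
        [
            {"country": {"id": "KE", "value": "Kenya"}, "date": "2016", "value": 12.5},
            {"country": {"id": "KE", "value": "Kenya"}, "date": "2015", "value": None},
        ],
    ]
)
rows = parse_records(sample_payload)
print(rows)
```

A client library hides these details, which is why the rest of this tutorial uses one instead of raw requests.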

This is extremely useful for:

Machine learning
Data engineering
Economic forecasting
Automated analytics pipelines

Step 1: Install Required Libraries

We will use:

pandas
wbdata
matplotlib
seaborn

Install them:

pip install wbdata pandas matplotlib seaborn

Step 2: Import the Libraries

import wbdata
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

The wbdata package simplifies interaction with the World Bank API.

Step 3: Define the Economic Indicators

Each World Bank indicator has a code.

For this project:

Economic Variable	Indicator Code
GDP Growth (%)	NY.GDP.MKTP.KD.ZG
Internet Usage (%)	IT.NET.USER.ZS

Define them:

indicators = {
    "NY.GDP.MKTP.KD.ZG": "GDP_Growth",
    "IT.NET.USER.ZS": "Internet_Usage"
}

Step 4: Pull Data from the World Bank API

Retrieve the data:

df = wbdata.get_dataframe(indicators)

Reset the index:

df = df.reset_index()

Preview the dataset:

print(df.head())

You now have live economic data directly from the World Bank.
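The call above needs network access, so the sketch below builds a small stand-in frame by hand to show what reset_index() changes. It assumes the returned frame is indexed by country and date with one column per indicator; the values are made up for illustration.

```python
import pandas as pd

# Stand-in for the frame returned by the API call; values are made up.
index = pd.MultiIndex.from_tuples(
    [("Kenya", "2016"), ("Kenya", "2015"), ("Ghana", "2016")],
    names=["country", "date"],
)
raw = pd.DataFrame(
    {"GDP_Growth": [4.2, 5.0, 3.4], "Internet_Usage": [16.6, None, 28.0]},
    index=index,
)

# reset_index() turns the index levels into ordinary columns,
# which makes filtering by country or date straightforward later on.
flat = raw.reset_index()
print(list(flat.columns))
print(flat)
```

After the reset, country and date can be used in boolean filters like any other column.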

Step 5: Clean the Dataset

Remove missing values:

df = df.dropna()

Check the remaining rows:

print(df.shape)

Real-world economic datasets often contain incomplete observations.

Cleaning is a critical step in analytics and machine learning.
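Before dropping rows, it can help to count how much is missing and to drop only on the columns the analysis needs. A minimal sketch on a hand-built frame with made-up values; the names `sample` and `cleaned` are illustrative.

```python
import pandas as pd

# Made-up values, for illustration only.
sample = pd.DataFrame(
    {
        "country": ["Kenya", "Kenya", "Ghana", "Ghana"],
        "date": ["2016", "2015", "2016", "2015"],
        "GDP_Growth": [4.2, 5.0, None, 2.1],
        "Internet_Usage": [16.6, None, 28.0, 25.0],
    }
)

# Count missing values per column before removing anything.
missing_per_column = sample.isna().sum()
print(missing_per_column)

# Drop a row only if one of the two analysis columns is missing.
cleaned = sample.dropna(subset=["GDP_Growth", "Internet_Usage"])
print(f"kept {len(cleaned)} of {len(sample)} rows")
```

Comparing the row counts before and after shows how much of the dataset the cleaning step removed.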

Step 6: Filter Recent Years

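To make the analysis more relevant, keep only observations from 2015 onward. With wbdata's default settings the date column holds each year as a string, so the comparison below is a string comparison; it works because every year has four digits.

df = df[df["date"] >= "2015"]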

You can also filter by country.

Example:

kenya_df = df[df["country"] == "Kenya"]
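The year and country filters can also be combined into one mask. The sketch below uses made-up data and converts the year to an integer first, so the comparison is numeric rather than string-based. The `year` column and the country list are arbitrary choices for illustration.

```python
import pandas as pd

# Made-up values, for illustration only.
sample = pd.DataFrame(
    {
        "country": ["Kenya", "Kenya", "Ghana", "Brazil"],
        "date": ["2014", "2018", "2019", "2020"],
        "GDP_Growth": [5.0, 5.6, 6.5, -3.3],
        "Internet_Usage": [13.0, 19.5, 43.0, 81.0],
    }
)

# Convert the year strings to integers for numeric comparison.
sample["year"] = sample["date"].astype(int)

# Keep rows from 2015 onward that belong to the chosen countries.
countries = ["Kenya", "Ghana"]
mask = (sample["year"] >= 2015) & sample["country"].isin(countries)
subset = sample[mask]
print(subset)
```

Using isin() keeps the filter readable when the list of countries grows.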

Step 7: Create a Scatter Plot

Now visualize the relationship between internet usage and GDP growth.

plt.figure(figsize=(10,6))

plt.scatter(
    df["Internet_Usage"],
    df["GDP_Growth"]
)

plt.xlabel("Internet Usage (%)")
plt.ylabel("GDP Growth (%)")
plt.title("Internet Usage vs GDP Growth")

plt.show()

This scatter plot allows you to visually inspect the relationship between the two variables.

Step 8: Add a Regression Line

Regression lines help estimate overall trends.

sns.regplot(
    x="Internet_Usage",
    y="GDP_Growth",
    data=df
)

plt.title("Internet Usage vs GDP Growth")
plt.show()

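The line estimates the average linear trend across all points. The shaded band that seaborn draws around it by default is a confidence interval for the fitted line, not a range for individual countries.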
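sns.regplot draws the line but returns a Matplotlib Axes, not the fitted coefficients. One way to get comparable numbers is an ordinary least-squares fit with np.polyfit. This is a stand-in for illustration, not necessarily the routine seaborn uses internally, and the data points are made up.

```python
import numpy as np

# Made-up points, for illustration only.
internet_usage = np.array([10.0, 25.0, 40.0, 60.0, 85.0])
gdp_growth = np.array([5.5, 4.8, 4.1, 3.0, 2.2])

# Fit a straight line: gdp_growth ~ slope * internet_usage + intercept.
slope, intercept = np.polyfit(internet_usage, gdp_growth, deg=1)
predicted = slope * internet_usage + intercept

print(f"slope={slope:.4f}, intercept={intercept:.4f}")
```

The slope reads as the average change in GDP growth, in percentage points, for each additional percentage point of internet usage.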

Understanding Correlation

Positive correlation means the two variables tend to rise and fall together.

Example: countries with higher internet adoption also showing higher economic growth.

A perfect positive relationship would appear as points lying exactly on an upward-sloping straight line.

Most economic relationships are weaker and noisier because many external factors affect economies simultaneously.
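The difference between a perfect and a noisy relationship is easy to see on synthetic data. The sketch below generates one series that follows an exact linear rule and one that follows the same trend with random noise added. The noise level and seed are arbitrary choices.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)
x = pd.Series(np.linspace(0, 100, 50))

perfect = 2 * x + 1                    # exact linear rule
noisy = 2 * x + rng.normal(0, 80, 50)  # same trend plus large random noise

perfect_r = x.corr(perfect)
noisy_r = x.corr(noisy)

print(round(perfect_r, 4))
print(round(noisy_r, 4))
```

The first coefficient is 1 up to floating-point rounding, while the second is lower because the noise hides part of the trend.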

Step 9: Calculate Correlation Numerically

You can calculate the correlation coefficient directly.

correlation = df["Internet_Usage"].corr(
    df["GDP_Growth"]
)

print(correlation)

Correlation values range from -1 to 1:

Value	Meaning
1	Perfect positive correlation
0	No correlation
-1	Perfect negative correlation

This gives a quantitative measure of the relationship.
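By default, Series.corr computes the Pearson coefficient: the sum of products of deviations from the mean, divided by the square root of the product of the summed squared deviations. The sketch below computes it by hand on made-up values and compares the result with pandas.

```python
import numpy as np
import pandas as pd

# Made-up values, for illustration only.
x = pd.Series([10.0, 25.0, 40.0, 60.0, 85.0])
y = pd.Series([5.5, 4.8, 4.1, 3.0, 2.2])

# Deviations of each observation from its column mean.
dx = x - x.mean()
dy = y - y.mean()

# Pearson r = sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
manual_r = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())
pandas_r = x.corr(y)

print(round(manual_r, 6), round(pandas_r, 6))
```

Both numbers agree, which confirms what the single .corr() call is measuring.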

Why This Matters for Machine Learning

Visualization is often the first stage of a machine learning project.

Before training models, analysts need to understand:

Variable relationships
Data quality
Outliers
Feature strength
Predictor usefulness
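Some of these checks take only a few lines. The sketch below summarises a column and flags outliers using the 1.5 × IQR rule. That rule is a common convention chosen here for illustration, not something this tutorial prescribes, and the values are made up.

```python
import pandas as pd

# Made-up values, for illustration only.
growth = pd.Series([2.1, 3.4, 4.2, 5.0, 3.8, -20.5, 4.6], name="GDP_Growth")

# Summary statistics give a first look at range and spread.
print(growth.describe())

# Flag values more than 1.5 * IQR outside the middle 50% of the data.
q1, q3 = growth.quantile(0.25), growth.quantile(0.75)
iqr = q3 - q1
outliers = growth[(growth < q1 - 1.5 * iqr) | (growth > q3 + 1.5 * iqr)]
print(outliers)
```

A flagged value is a prompt to investigate, not an automatic reason to delete the row.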

Economic visualization helps improve:

Feature engineering
Model selection
Forecasting accuracy
Business interpretation

This is why data visualization is foundational in:

Data science
Economics
Financial analytics
Business intelligence
Machine learning

The World Bank API is one of the best free resources for learning economic analytics and machine learning.

By combining:

API-driven data collection
Pandas
Visualization
Correlation analysis

you can build professional-grade analytics workflows using real-world global economic data.

Learning how to visualize relationships between economic variables is one of the first major steps toward becoming a strong data analyst or machine learning engineer.
