How to Visualise the Relationship Between Two Economic Variables Using the World Bank API

Economic data becomes far more valuable when you can visualize relationships between variables.




Instead of manually downloading CSV files, professional analysts often use APIs to pull live economic data directly into Python workflows.

One of the best sources for this is the World Bank API.


Using the API allows you to:

  • Access updated economic indicators

  • Automate data collection

  • Build scalable ML pipelines

  • Analyze global datasets efficiently

  • Create economic dashboards that refresh automatically


In this tutorial, we will use the World Bank API to analyze the relationship between:

  • Internet Usage (%)

  • GDP Growth (%)

using Python and visualization tools.


Why Use the World Bank API?

The World Bank API gives direct programmatic access to thousands of global indicators.

You can retrieve:

  • GDP

  • Inflation

  • Population

  • Trade

  • Education

  • Internet usage

  • Energy access

  • Health indicators

without manually downloading spreadsheets.
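
Under the hood, these indicators are served over a plain REST endpoint. The sketch below is a minimal illustration, assuming the v2 URL pattern (country, then indicator, with format=json) and a two-element JSON response made of paging metadata followed by a list of records. The helper names build_url and flatten_records are hypothetical, and the sample payload uses made-up values purely to show the shape.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.worldbank.org/v2"  # assumed v2 endpoint


def build_url(country, indicator, per_page=100):
    """Build a request URL for one country and one indicator (hypothetical helper)."""
    query = urlencode({"format": "json", "per_page": per_page})
    return f"{BASE_URL}/country/{country}/indicator/{indicator}?{query}"


def flatten_records(payload):
    """Turn the [metadata, records] response into a list of flat dicts."""
    _metadata, records = payload
    return [
        {
            "country": rec["country"]["value"],
            "date": rec["date"],
            "value": rec["value"],
        }
        for rec in (records or [])
    ]


# Made-up payload that mimics the assumed response shape; the numbers are NOT real data.
sample = json.loads("""
[
  {"page": 1, "pages": 1, "per_page": 100, "total": 2},
  [
    {"country": {"id": "KE", "value": "Kenya"}, "date": "2021", "value": 1.5},
    {"country": {"id": "KE", "value": "Kenya"}, "date": "2020", "value": null}
  ]
]
""")

url = build_url("KE", "NY.GDP.MKTP.KD.ZG")
rows = flatten_records(sample)
print(url)
print(rows)
```

In practice the wbdata package used later in this tutorial handles the request and parsing for you, so this is only meant to show what is being wrapped.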


This is extremely useful for:

  • Machine learning

  • Data engineering

  • Economic forecasting

  • Automated analytics pipelines


Step 1: Install Required Libraries

We will use:

  • pandas

  • wbdata

  • matplotlib

  • seaborn

Install them:

pip install wbdata pandas matplotlib seaborn



Step 2: Import the Libraries

import wbdata
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

The wbdata package simplifies interaction with the World Bank API.



Step 3: Define the Economic Indicators

Each World Bank indicator has a code.

For this project:

Economic Variable            Indicator Code
GDP Growth (%)               NY.GDP.MKTP.KD.ZG
Internet Usage (%)           IT.NET.USER.ZS

Define them:

indicators = {
    "NY.GDP.MKTP.KD.ZG": "GDP_Growth",
    "IT.NET.USER.ZS": "Internet_Usage"
}



Step 4: Pull Data from the World Bank API

Retrieve the data:

df = wbdata.get_dataframe(indicators)

Reset the index:

df = df.reset_index()

Preview the dataset:

print(df.head())
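
As a rough illustration of what reset_index() does here, the sketch below builds a small stand-in frame indexed by country and date, which is the layout get_dataframe is assumed to return when several countries are requested. The values are invented placeholders, not World Bank figures.

```python
import pandas as pd

# Stand-in for the API result: a (country, date) MultiIndex with one column per indicator.
# All numbers below are invented placeholders.
index = pd.MultiIndex.from_product(
    [["Kenya", "Ghana"], ["2021", "2020"]],
    names=["country", "date"],
)
raw = pd.DataFrame(
    {
        "GDP_Growth": [1.0, 2.0, 3.0, 4.0],
        "Internet_Usage": [10.0, 20.0, 30.0, 40.0],
    },
    index=index,
)

# reset_index() moves both index levels into ordinary columns,
# which makes filtering by country or date straightforward.
flat = raw.reset_index()
print(flat.columns.tolist())
print(flat.head())
```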


You now have up-to-date economic data pulled directly from the World Bank.


Step 5: Clean the Dataset

Remove missing values:

df = df.dropna()

Check the remaining rows:

print(df.shape)



Real-world economic datasets often contain incomplete observations.

Cleaning is a critical step in analytics and machine learning.
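
Before dropping rows, it can help to see how many values are missing in each column, since dropna() removes a row if either indicator is absent. A minimal sketch on invented placeholder data:

```python
import numpy as np
import pandas as pd

# Invented placeholder rows; NaN marks a missing observation.
sample = pd.DataFrame(
    {
        "country": ["A", "A", "B", "B"],
        "date": ["2021", "2020", "2021", "2020"],
        "GDP_Growth": [2.0, np.nan, 3.0, 1.0],
        "Internet_Usage": [50.0, 45.0, np.nan, 30.0],
    }
)

# Count missing values per column before removing anything.
missing_per_column = sample.isna().sum()
print(missing_per_column)

# Keep only rows where both indicators are present.
cleaned = sample.dropna(subset=["GDP_Growth", "Internet_Usage"])
print(f"Rows before: {len(sample)}, after: {len(cleaned)}")
```

Passing subset limits the check to the two indicator columns, so a gap in some unrelated column would not cost you an otherwise usable row.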


Step 6: Filter Recent Years

To make the analysis more relevant, keep recent observations only.

df = df[df["date"] >= "2015"]

You can also filter by country.

Example:

kenya_df = df[df["country"] == "Kenya"]
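
The date comparison above works on year strings because four-digit years sort the same way as text and as numbers. If you prefer an explicit numeric comparison, one option is to convert the column first. The sketch below assumes the date column holds plain year strings such as "2015" and uses invented placeholder rows; the year column is a name introduced here for illustration.

```python
import pandas as pd

# Invented placeholder rows with year strings in the date column.
sample = pd.DataFrame(
    {
        "country": ["Kenya", "Kenya", "Ghana", "Ghana"],
        "date": ["2013", "2016", "2015", "2010"],
        "GDP_Growth": [1.0, 2.0, 3.0, 4.0],
    }
)

# Convert year strings to integers so the filter is a numeric comparison.
sample["year"] = pd.to_numeric(sample["date"]).astype(int)

recent = sample[sample["year"] >= 2015]

# Combine the year filter with a country filter using a boolean AND.
recent_kenya = sample[(sample["year"] >= 2015) & (sample["country"] == "Kenya")]
print(recent)
print(recent_kenya)
```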



Step 7: Create a Scatter Plot

Now visualize the relationship between internet usage and GDP growth.

plt.figure(figsize=(10,6))

plt.scatter(
    df["Internet_Usage"],
    df["GDP_Growth"]
)

plt.xlabel("Internet Usage (%)")
plt.ylabel("GDP Growth (%)")
plt.title("Internet Usage vs GDP Growth")

plt.show()



This scatter plot allows you to visually inspect the relationship between the two variables.


Step 8: Add a Regression Line

Regression lines help estimate overall trends.

sns.regplot(
    x="Internet_Usage",
    y="GDP_Growth",
    data=df
)

plt.title("Internet Usage vs GDP Growth")
plt.show()



The line estimates the average direction of the relationship.
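
By default regplot draws an ordinary least-squares straight line. To see the numbers behind such a line, a degree-1 fit with NumPy returns a slope and an intercept. This is a sketch on invented placeholder values, not the tutorial's dataset, so the coefficients say nothing about real economies.

```python
import numpy as np

# Invented placeholder values standing in for the two columns.
internet_usage = np.array([10.0, 20.0, 30.0, 40.0])
gdp_growth = np.array([2.0, 4.0, 6.0, 8.0])

# A degree-1 polynomial fit is an ordinary least-squares straight line.
slope, intercept = np.polyfit(internet_usage, gdp_growth, deg=1)

# Slope: change in GDP growth (percentage points) per one-point rise in internet usage.
print(f"slope={slope:.3f}, intercept={intercept:.3f}")

# Value on the fitted line for a given internet usage level.
predicted = slope * 25.0 + intercept
print(f"predicted at 25%: {predicted:.3f}")
```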


Understanding Correlation

Positive correlation means both variables tend to move in the same direction: when one is higher, the other tends to be higher too.

Example:

  • Higher internet adoption

  • Higher economic growth

A perfect positive relationship would appear as a straight, upward-sloping line of points on the scatter plot.

Most economic relationships are weaker and noisier because many external factors affect economies simultaneously.


Step 9: Calculate Correlation Numerically

You can calculate the correlation coefficient directly.

correlation = df["Internet_Usage"].corr(
    df["GDP_Growth"]
)

print(correlation)


Correlation values range from -1 to 1:

Value                        Meaning
1                            Perfect positive correlation
0                            No linear correlation
-1                           Perfect negative correlation

This gives a quantitative measure of the relationship.
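
The corr method in pandas computes the Pearson coefficient by default. The sketch below spells that calculation out in plain Python so the endpoints of the range are easier to see. The helper pearson_r is hypothetical and the inputs are invented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance over the product of standard deviations.

    The 1/n factors cancel, so plain sums of deviations are used.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [1.0, 2.0, 3.0, 4.0]

rising = pearson_r(xs, [2.0, 4.0, 6.0, 8.0])    # points on an upward line
falling = pearson_r(xs, [8.0, 6.0, 4.0, 2.0])   # points on a downward line
mixed = pearson_r(xs, [1.0, -1.0, -1.0, 1.0])   # no linear trend

print(rising, falling, mixed)
```

Real indicator data will almost never land on these exact values; the three cases are only reference points for reading the number that corr returns.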


Why This Matters for Machine Learning

Visualization is often the first stage of machine learning.

Before training models, analysts need to understand:

  • Variable relationships

  • Data quality

  • Outliers

  • Feature strength

  • Predictor usefulness


Economic visualization helps improve:

  • Feature engineering

  • Model selection

  • Forecasting accuracy

  • Business interpretation


This is why data visualization is foundational in:

  • Data science

  • Economics

  • Financial analytics

  • Business intelligence

  • Machine learning


The World Bank API is one of the best free resources for learning economic analytics and machine learning.

By combining:

  • API-driven data collection

  • Pandas

  • Visualization

  • Correlation analysis

you can build professional-grade analytics workflows using real-world global economic data.

Learning how to visualize relationships between economic variables is one of the first major steps toward becoming a strong data analyst or machine learning engineer.


