How to Build a Pivot Table From Our World in Data Demographics
Demographic datasets from Our World in Data are excellent for learning data analysis because they contain real-world statistics on population, life expectancy, fertility rates, migration, and age distribution across countries.
One of the fastest ways to summarize this data is with a pivot table. In this tutorial, you will use Python and Pandas in Google Colab to create a demographic pivot table from an Our World in Data dataset.
A pivot table is a data summarization tool used to reorganize, group, and analyze large datasets quickly.
It allows you to turn raw rows of data into meaningful summaries without changing the original dataset.
Why It’s Called a “Pivot” Table
The word “pivot” means you can rotate or rearrange the structure of the data:
- Rows can become columns
- Columns can become grouped categories
- Values can be summarized differently
You are essentially “pivoting” the dataset around specific fields.
Think of a pivot table as:
“A fast way to turn messy rows into organized insights.”
Instead of manually filtering and calculating data, the pivot table does the summarization automatically.
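To make the idea concrete before loading real data, here is a minimal sketch of a pivot on a tiny hand-made table. The country names and population numbers are invented for illustration and do not come from Our World in Data.

```python
import pandas as pd

# Hypothetical sample in "long" format: one row per country and year.
# The numbers are made up for illustration only.
long_df = pd.DataFrame({
    "country": ["Kenya", "Kenya", "Peru", "Peru"],
    "year": [2000, 2010, 2000, 2010],
    "population": [100, 120, 50, 60],
})

# Pivot around the 'year' and 'country' fields:
# years become rows, countries become columns.
wide_df = long_df.pivot_table(
    values="population",
    index="year",
    columns="country",
    aggfunc="sum",
)

print(wide_df)
```

The original long_df is left unchanged; the pivot returns a new DataFrame.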
Step 1: Install and Import Pandas
Pandas comes preinstalled in Google Colab, so you only need to import it.
import pandas as pd
Step 2: Upload the Demographic Dataset
Download a CSV dataset from the Our World in Data demographics collection. This tutorial uses the population data, but you can also use datasets containing:
- Population
- Life expectancy
- Fertility rates
- Median age
Upload the file into Google Colab.
from google.colab import files
uploaded = files.upload()
Step 3: Load the CSV File
Load the uploaded dataset into a DataFrame.
file_name = list(uploaded.keys())[0]
df = pd.read_csv(file_name)
df.head()
Typical columns may include:
- country
- year
- population
- life_expectancy
Step 4: Inspect the Dataset
Check the available columns.
print(df.columns)
Example output:
Index(['country', 'year', 'population'], dtype='object')
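The column names in your download may not match the ones used in this tutorial. As a hedged sketch, assuming your file uses names such as Entity, Code, and Year (an assumption about the file, so check your own output from print(df.columns)), you could rename them like this. The small DataFrame below is a made-up stand-in for the uploaded file.

```python
import pandas as pd

# Hypothetical stand-in for a downloaded file whose columns are named differently.
raw = pd.DataFrame({
    "Entity": ["Kenya", "Peru"],
    "Code": ["KEN", "PER"],
    "Year": [2000, 2000],
    "Population": [100, 50],
})

# Map the names found in the file to the names used in this tutorial.
# Adjust the keys to whatever print(df.columns) shows for your dataset.
renamed = raw.rename(columns={
    "Entity": "country",
    "Year": "year",
    "Population": "population",
})

print(renamed.columns.tolist())
```

Renaming once, right after loading, keeps the later pivot code consistent regardless of how the source file labels its columns.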
Step 5: Create the Pivot Table
Now build a pivot table that summarizes population by year and country.
pivot_table = pd.pivot_table(
    df,
    values='population',
    index='year',
    columns='country',
    aggfunc='sum'
)
pivot_table.head()
This creates:
- Rows = years
- Columns = countries
- Values = total population
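Once the table has this shape, reading values out of it is straightforward. The sketch below uses a small made-up sample (not real Our World in Data figures) to show how one might pull a single country, a single year, or the change between two years.

```python
import pandas as pd

# Hypothetical sample data; the numbers are invented for illustration.
sample = pd.DataFrame({
    "country": ["Kenya", "Kenya", "Peru", "Peru"],
    "year": [2000, 2010, 2000, 2010],
    "population": [100, 120, 50, 60],
})

pivot = pd.pivot_table(
    sample,
    values="population",
    index="year",
    columns="country",
    aggfunc="sum",
)

# One country across all years (a column of the pivot table).
kenya = pivot["Kenya"]

# All countries in one year (a row of the pivot table).
year_2010 = pivot.loc[2010]

# Change in population between two years, per country.
change = pivot.loc[2010] - pivot.loc[2000]

print(change)
```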
Step 6: Create a More Advanced Pivot Table
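You can also calculate averages instead of totals.
For example, if your dataset includes a life_expectancy column, you can compute average life expectancy by year and country.
pivot_life = pd.pivot_table(
    df,
    values='life_expectancy',  # must match a column name shown in Step 4
    index='year',
    columns='country',
    aggfunc='mean'
)
pivot_life.head()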
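Note: Countries with no data for a given year show up as NaN in the pivot table. Review the other lessons on data cleaning and clean up the NaN columns before further analysis.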
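As one possible way to handle missing values, the sketch below counts the NaN entries per country and then keeps only the countries with complete data. The sample values are invented, and whether dropping columns is the right choice depends on your analysis.

```python
import pandas as pd

# Hypothetical sample: Peru has no row for 2010, so the pivot gets a NaN there.
sample = pd.DataFrame({
    "country": ["Kenya", "Kenya", "Peru"],
    "year": [2000, 2010, 2000],
    "life_expectancy": [50.0, 60.0, 70.0],
})

pivot = pd.pivot_table(
    sample,
    values="life_expectancy",
    index="year",
    columns="country",
    aggfunc="mean",
)

# Count missing values in each country column.
missing_per_country = pivot.isna().sum()

# Keep only the columns (countries) that have a value for every year.
complete = pivot.dropna(axis="columns", how="any")

print(missing_per_country)
print(complete)
```

Using how="all" instead would drop only the columns that are entirely empty.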
Step 7: Export the Pivot Table
Save the pivot table as a CSV file.
pivot_table.to_csv("population_pivot.csv")
Download it:
files.download("population_pivot.csv")
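If you want to check the exported file, you can read it back with Pandas. This is a minimal sketch on made-up data that writes to a temporary folder instead of the Colab working directory; it assumes the index is named year, as in the pivot tables above.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data; the numbers are invented for illustration.
sample = pd.DataFrame({
    "country": ["Kenya", "Kenya", "Peru", "Peru"],
    "year": [2000, 2010, 2000, 2010],
    "population": [100, 120, 50, 60],
})

pivot = pd.pivot_table(
    sample,
    values="population",
    index="year",
    columns="country",
    aggfunc="sum",
)

# Write the pivot table to a CSV file in a temporary folder.
csv_path = os.path.join(tempfile.mkdtemp(), "population_pivot.csv")
pivot.to_csv(csv_path)

# Read it back, telling Pandas that the 'year' column is the index.
restored = pd.read_csv(csv_path, index_col="year")

print(restored)
```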
Why Pivot Tables Matter
Pivot tables help analysts quickly:
- Compare demographic trends
- Summarize millions of rows
- Analyze population growth
- Study regional patterns
- Prepare dashboards for BI tools
They are widely used in data engineering, business intelligence, economics, healthcare analytics, and public policy reporting.
Final Thoughts
Using Pandas pivot tables with datasets from Our World in Data allows you to transform raw demographic data into structured insights in minutes.
For aspiring data analysts and data engineers, mastering pivot tables is one of the most practical skills you can develop for real-world reporting and analytics workflows.