How to Build a Histogram That Actually Tells a Story

A histogram is a type of graph used in statistics to show how data is distributed.



It looks similar to a bar chart, but it represents numerical data grouped into ranges (called bins), rather than categories.

Key idea

A histogram answers the question:
👉 “How many data values fall into each range?”

How it works

  • The horizontal axis (x-axis) shows intervals or ranges of values (for example: 0–10, 10–20, 20–30).
  • The vertical axis (y-axis) shows the frequency (how many data points fall in each range).
  • Each bar touches the next one (no gaps), because the data is continuous.
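To make the counting concrete, here is a minimal sketch of the binning step on its own, without any chart. The values are made up purely for illustration, and NumPy is an assumption here (the tutorial itself only uses pandas and matplotlib).

```python
import numpy as np

# Made-up values, chosen only to illustrate how binning works.
values = np.array([3, 7, 10, 12, 15, 18, 21, 24, 30])

# Three bins with edges at 0, 10, 20 and 30.
counts, edges = np.histogram(values, bins=[0, 10, 20, 30])

# Print one line per bin: the range and how many values fell into it.
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left}-{right}: {count}")
```

Note how the value 10 is counted in the 10–20 bin, not in 0–10: each bin includes its left edge and excludes its right edge, except the last bin, which also includes its right edge (so 30 is counted in 20–30).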

Most people build histograms incorrectly.



They load a dataset, generate bars, and stop there. The result is usually a chart with no narrative, no analytical value, and no insight beyond “these numbers exist.”

A good histogram should explain something important immediately.

In this tutorial, you will use real economic data from the World Bank to build a histogram in Google Colab that tells a clear story about African economies.

We will use GDP per capita data to answer a meaningful question:

How is wealth distributed across African countries?

That single question transforms the histogram from a technical exercise into a business and economic analysis tool.

You can download the dataset as a CSV file directly from the World Bank.





The indicator measures:

GDP per capita (current US dollars)


This represents the average economic output generated per person in a country.

After downloading the CSV file, upload it into Google Colab using:

import pandas as pd
from google.colab import files

uploaded = files.upload()




Once uploaded, load the dataset:

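# files.upload() returns a dict keyed by file name; take the first (only) file
file_name = list(uploaded.keys())[0]

# Skip the 4 metadata lines at the top of the World Bank CSV
df = pd.read_csv(file_name, skiprows=4)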



The skiprows=4 argument is important because World Bank CSV files include metadata rows before the actual table begins.
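If you want to see the effect of skiprows without downloading anything, the sketch below builds a tiny in-memory file that imitates that layout. The contents of sample_csv are placeholders invented for this example, not real World Bank data.

```python
import io
import pandas as pd

# Invented miniature file: four lines of metadata (two of them blank),
# followed by the header row and two data rows with placeholder numbers.
sample_csv = (
    '"Data Source","placeholder"\n'
    '\n'
    '"Last Updated Date","placeholder"\n'
    '\n'
    '"Country Name","Country Code","2023","2024"\n'
    '"Kenya","KEN",100,110\n'
    '"Ghana","GHA",200,210\n'
)

# skiprows=4 drops the first four lines, so line 5 becomes the header.
df_sample = pd.read_csv(io.StringIO(sample_csv), skiprows=4)

print(df_sample.columns.tolist())
print(len(df_sample), "data rows")
```

Without skiprows=4, pandas would try to treat the first metadata line as the header and the table would not load correctly.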

Next, filter the dataset to a selection of African countries:

african_countries = [
    "Kenya", "Nigeria", "South Africa", "Ghana",
    "Ethiopia", "Uganda", "Tanzania", "Rwanda",
    "Botswana", "Senegal", "Zambia"
]

df_africa = df[df['Country Name'].isin(african_countries)]
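One thing to watch with isin() is that it matches names exactly, and it silently ignores any name it cannot find. The sketch below shows one possible way to check for that. The df_demo table and the wanted list are hypothetical stand-ins for df and african_countries, and the numbers are placeholders.

```python
import pandas as pd

# Hypothetical stand-in for the loaded World Bank table (placeholder numbers).
df_demo = pd.DataFrame({
    "Country Name": ["Kenya", "Nigeria", "Egypt, Arab Rep.", "France"],
    "2024": [1.0, 2.0, 3.0, 4.0],
})

# Hypothetical list of names we want to keep.
wanted = ["Kenya", "Nigeria", "Egypt"]

# Same filtering technique as in the tutorial.
matched = df_demo[df_demo["Country Name"].isin(wanted)]

# Names in the list that matched no row at all.
missing = sorted(set(wanted) - set(matched["Country Name"]))
print("Names with no match:", missing)
```

Here "Egypt" finds nothing because the World Bank spells it "Egypt, Arab Rep.". Running a check like this after filtering helps you catch a country that quietly dropped out of the chart.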



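Now select the GDP per capita values for the most recent year in the file. This tutorial uses 2024; if your download has no 2024 column, or that column is mostly empty, use the latest year that has data instead:

# Keep only the country name and the chosen year
gdp_data = df_africa[['Country Name', '2024']]

# Drop countries with no value reported for that year
gdp_data = gdp_data.dropna()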
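If you would rather not hard-code the year, one possible approach is to let pandas find the most recent year column that contains any data. This is only a sketch: df_demo is a hypothetical stand-in for df_africa, the numbers are placeholders rather than real GDP figures, and the variable names are invented for this example.

```python
import pandas as pd

# Hypothetical stand-in for df_africa; placeholder numbers, not real GDP figures.
df_demo = pd.DataFrame({
    "Country Name": ["A", "B", "C"],
    "2022": [10.0, 20.0, 30.0],
    "2023": [11.0, 21.0, None],
    "2024": [None, None, None],
})

# Year columns are the ones whose names are all digits, such as "2023".
year_cols = [col for col in df_demo.columns if col.isdigit()]

# Keep the years that hold at least one value, then take the most recent.
years_with_data = [col for col in year_cols if df_demo[col].notna().any()]
latest_year = max(years_with_data, key=int)

# Select that year and drop countries with no value for it.
latest = df_demo[["Country Name", latest_year]].dropna()
print(latest_year, len(latest))
```

In this toy table the 2024 column is empty, so the code falls back to 2023 and keeps the two countries that reported a value for it.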



Finally, create the histogram:

import matplotlib.pyplot as plt

plt.figure(figsize=(10,6))

plt.hist(gdp_data['2024'], bins=6)

plt.xlabel("GDP Per Capita (US Dollars)")
plt.ylabel("Number of Countries")
plt.title("Distribution of GDP Per Capita Across African Countries")

plt.show()
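A few small additions can make the chart easier to read. The sketch below is one possible variation, not the only way to do it: it uses round-number bin edges, outlines each bar, and marks the median with a dashed line. The values array is a placeholder standing in for gdp_data['2024'], and the bin width of 1,000 is an arbitrary choice for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder values standing in for gdp_data['2024']; not real GDP figures.
values = np.array([800, 950, 1100, 1300, 1500, 2100, 2400, 6500, 7200])

# Round-number bin edges (0, 1000, ..., 8000) are easier to read than automatic ones.
edges = np.arange(0, 9000, 1000)

fig, ax = plt.subplots(figsize=(10, 6))

# Outline each bar so neighbouring bins are easy to tell apart.
counts, _, _ = ax.hist(values, bins=edges, edgecolor="white")

# Mark the median so the reader sees where the typical country sits.
ax.axvline(np.median(values), color="black", linestyle="--", label="Median")

ax.set_xlabel("GDP Per Capita (US Dollars)")
ax.set_ylabel("Number of Countries")
ax.set_title("Distribution of GDP Per Capita Across African Countries")
ax.legend()

plt.show()
```

To use it with the real data, replace values with gdp_data['2024'] and adjust the bin edges to cover the range of your numbers.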






This is where the real analytical work begins.

A histogram tells a story through distribution.

When you look at this chart, you will likely notice that many African countries cluster in the lower GDP-per-capita ranges, while only a few appear in the higher ranges.

That shape matters.

It immediately reveals how concentrated economic output is and how unequal average incomes are between countries.

For example:

  • if the tallest bars are crowded on the left side, most countries sit in the lower-income ranges and the distribution is right-skewed (a long tail toward higher values),

  • if the bars are of similar height across the range, countries are spread more evenly across income levels,

  • if there are isolated bars far to the right, a few countries have a much higher GDP per capita than the rest.
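You can back up what you see in the chart with a couple of numbers. The sketch below compares the mean with the median and computes the skewness using pandas. The values are placeholders standing in for gdp_data['2024'], not real GDP figures.

```python
import pandas as pd

# Placeholder values standing in for gdp_data['2024']; not real GDP figures.
values = pd.Series([800, 950, 1100, 1300, 1500, 2100, 2400, 6500, 7200])

mean_value = values.mean()
median_value = values.median()
skewness = values.skew()  # positive means a longer tail on the right

print("Mean:", mean_value)
print("Median:", median_value)
print("Skewness:", round(skewness, 2))
```

When a few high values pull the mean well above the median and the skewness is positive, as in this toy data, that matches a histogram with its tallest bars on the left and a long tail to the right.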

This is why histograms are powerful in economics, business intelligence, and policy analysis.

The bars are not the story.

The distribution is the story.

A weak histogram simply visualizes numbers.

A strong histogram explains patterns.

This distinction is critical in data analytics because decision-makers rarely care about charts alone.

They care about what the distribution reveals:

  • inequality,

  • concentration,

  • volatility,

  • market segmentation,

  • or operational patterns.

For example, the exact same histogram technique can be used for:

  • mobile money transaction sizes in Kenya,

  • export values across African countries,

  • rainfall distributions in East Africa,

  • survey respondent ages,

  • or startup funding amounts across regions.

The goal is always the same: use distribution to reveal structure inside the data.

That is what makes a histogram useful instead of decorative.


