How to use df.head(), df.info(), and df.describe() to explore any dataset
Learn how to quickly explore any dataset in Python using df.head(), df.info(), and df.describe() for a fast overview of data structure, types, and summary statistics.
When you first load a dataset in Python with pandas, it’s crucial to understand its structure and contents. Three core commands give you a rapid overview:
1. df.head()
Shows the first 5 rows by default (pass a number to view more or fewer).
Helps check if the data loaded correctly and inspect sample values.
import pandas as pd
df = pd.read_csv("your_dataset.csv")
print(df.head())
print(df.head(10)) # first 10 rows
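If you don't have a CSV file at hand, the same calls can be tried on a small DataFrame built in memory. The sketch below is only an illustration: the column names and values are made up, and it simply stands in for whatever your_dataset.csv contains.

```python
import pandas as pd

# Hypothetical sample data standing in for your_dataset.csv
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Pune", "Kyiv", "Baku", "Doha", "Riga"],
    "temp_c": [4.5, 19.0, 27.3, 8.1, 15.2, 33.4, 6.0],
})

first_five = df.head()          # default: first 5 rows
first_two = df.head(2)          # explicit row count
all_but_last_two = df.head(-2)  # negative n: every row except the last 2

print(first_five)
print(first_two)
print(all_but_last_two)
```

Note that df.head() returns a new DataFrame, so the result can be stored in a variable and inspected further instead of only being printed.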
2. df.info()
Displays a concise summary of the DataFrame.
Key details: number of rows, columns, column names, non-null counts, and data types.
df.info()
Output includes:
Total rows and columns
Column names
Data type of each column (int64, float64, object, etc.)
Number of non-null entries (useful for spotting missing data)
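To see how the non-null counts point to missing data, here is a minimal sketch on a made-up DataFrame with a few gaps. The column names and values are assumptions for illustration only. It captures the df.info() text through its buf parameter and, as a complementary check that the article does not cover, counts missing values per column with df.isna().sum().

```python
import io

import pandas as pd

# Hypothetical data with some missing entries
df = pd.DataFrame({
    "name": ["Ana", "Ben", None, "Dee"],
    "age": [31, 45, 28, 39],
    "score": [88.5, None, 92.0, None],
})

# df.info() prints to stdout by default; buf lets us capture the text instead
buffer = io.StringIO()
df.info(buf=buffer)
summary = buffer.getvalue()
print(summary)

# Complementary check: number of missing values in each column
missing_per_column = df.isna().sum()
print(missing_per_column)
```

A column whose non-null count is lower than the total number of entries has missing values, and the isna().sum() result gives the exact number per column.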
3. df.describe()
By default, provides summary statistics for numerical columns.
Includes count, mean, std (standard deviation), min, max, and quartiles (25%, 50%, 75%).
print(df.describe())
Optional: pass include='all' to get statistics for all columns, including categorical ones:
print(df.describe(include='all'))
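The difference between the two calls is easiest to see on a tiny example. The following sketch uses invented product and price data (both names are assumptions) with one text column and one numeric column.

```python
import pandas as pd

# Hypothetical data: one categorical column, one numeric column
df = pd.DataFrame({
    "product": ["pen", "pen", "ink", "pad"],
    "price": [1.0, 2.0, 3.0, 4.0],
})

# Default: only the numeric column is summarized
numeric_stats = df.describe()
print(numeric_stats)

# include="all": adds unique, top, and freq rows for non-numeric columns
all_stats = df.describe(include="all")
print(all_stats)
```

With include="all", statistics that do not apply to a column (for example, the mean of a text column or the top value of a numeric one) are shown as NaN.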
Key Takeaways:
df.head() → preview sample rows
df.info() → understand structure, types, missing values
df.describe() → get numerical summaries quickly
These three commands give a fast, reliable first look at any dataset before deeper analysis.