Lesson 4 - Measures of Variability

Welcome to Measures of Variability

In the previous lessons you learned to find the center of a distribution: the mean, the median, the mode. But the center alone can hide as much as it reveals. Two groups can share an identical average while one is tightly bunched and the other sprawls all over the place. To describe data honestly, you need a second number: a measure of how spread out the values are.

In this lesson you will build that second number from the ground up using a real dataset of cars, starting with the crude-but-quick range and working toward the standard deviation, the most widely used measure of spread in all of statistics.

By the end of this lesson, you will be able to:

  • Explain why the center of a distribution is not enough on its own
  • Compute and interpret the range, interquartile range, and mean absolute deviation
  • Calculate variance and standard deviation, and explain why standard deviation is preferred
  • Distinguish population ($N$) from sample ($n-1$) formulas and explain why we divide by $n-1$

You only need a little Python and pandas. Let’s begin.


Why the Center Is Not Enough

Imagine two columns of numbers that both average to about 23.5. From the mean alone they look identical. But suppose one column hugs that average closely while the other swings wildly above and below it. Reporting only the mean would treat them as the same — and badly mislead anyone who relied on it.

Figure: Two overlaid histograms with the same mean (≈ 23.5) but very different spread; one is narrow and tall, the other is wide and flat. The center cannot tell them apart; you need a measure of variability to do that.

The whole job of a measure of variability (also called a measure of spread or dispersion) is to capture that difference in a single number: how far, typically, do the values stray from the center?
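To make the idea concrete, here is a minimal sketch using two small made-up lists rather than the cars data. Both lists average exactly 23.5, yet the gap between their largest and smallest values (a crude stand-in for spread) is very different. The names `tight` and `wide` are hypothetical.

```python
# Two hypothetical columns of values, invented for illustration only.
tight = [22.5, 23.0, 23.5, 24.0, 24.5]
wide = [9.5, 16.0, 23.5, 31.0, 37.5]

# Both columns have the same center.
mean_tight = sum(tight) / len(tight)
mean_wide = sum(wide) / len(wide)

# The width (largest minus smallest) is what tells them apart.
width_tight = max(tight) - min(tight)
width_wide = max(wide) - min(wide)

print("means: ", mean_tight, mean_wide)    # 23.5 23.5
print("widths:", width_tight, width_wide)  # 2.0 28.0
```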

Let’s load the dataset we’ll use throughout. It records 398 cars, including each car’s fuel economy in miles per gallon (mpg), weight, horsepower, number of cylinders, and region of origin.

import pandas as pd

cars = pd.read_csv("https://datatweets.com/datasets/cars.csv")
print(cars.shape)
print(cars[["mpg", "weight", "origin"]].head())
(398, 9)
    mpg  weight origin
0  18.0    3504    usa
1  15.0    3693    usa
2  18.0    3436    usa
3  16.0    3433    usa
4  17.0    3449    usa

Our focus is the mpg column. Its mean is about 23.5 miles per gallon — but how much do real cars vary around that figure? That is the question every measure below answers in a different way.

print(round(cars["mpg"].mean(), 2))
23.51

The Range

The simplest measure of spread is the range: the distance between the largest and smallest values.

$$\text{range} = \max(x) - \min(x)$$

mpg = cars["mpg"]
print(mpg.min(), mpg.max())
print(round(mpg.max() - mpg.min(), 1))
9.0 46.6
37.6

The thirstiest car gets just 9.0 mpg, the most efficient 46.6, so the range is 37.6 mpg. That one number tells you the full width of the data at a glance.

The range is fast and intuitive, but it has a serious weakness: it depends on only two values, the two most extreme ones. A single unusual car — one freakishly efficient prototype — can blow the range up while telling you nothing about the other 397 cars. Because it ignores everything in between, the range is sensitive to outliers and gives no sense of how the bulk of the data behaves. The next measure fixes exactly that.
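The sensitivity described above is easy to see on a toy example. The sketch below uses invented readings, not the cars data, and a hypothetical helper `value_range`; adding one extreme value changes the range dramatically even though the other values are untouched.

```python
# Hypothetical fuel-economy readings, invented for illustration.
typical = [18.0, 21.5, 23.0, 24.5, 27.0, 29.5]


def value_range(values):
    """Return the distance between the largest and smallest value."""
    return max(values) - min(values)


range_before = value_range(typical)

# Add a single extreme value; the other six values are unchanged.
with_outlier = typical + [60.0]
range_after = value_range(with_outlier)

print(range_before)  # 11.5
print(range_after)   # 42.0
```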


The Interquartile Range

The interquartile range (IQR) measures the spread of the middle 50% of the data, deliberately ignoring the extremes. To find it, sort the values and locate the quartiles: the first quartile $Q_1$ is the 25th percentile (a quarter of the values fall below it) and the third quartile $Q_3$ is the 75th percentile.

Figure: A strip of sorted data points comparing the full range bracket (minimum to maximum) with the narrower IQR bracket ($Q_1$ to $Q_3$). The range is swayed by extremes, while the IQR spans only the middle 50% and shrugs off outliers.

$$\text{IQR} = Q_3 - Q_1$$

q1 = mpg.quantile(0.25)
q3 = mpg.quantile(0.75)
print("Q1 =", q1, " Q3 =", q3)
print("IQR =", round(q3 - q1, 1))
Q1 = 17.5  Q3 = 29.0
IQR = 11.5

So the middle half of cars sit between 17.5 and 29.0 mpg, an IQR of 11.5 mpg. Compare that to the full range of 37.6: the middle 50% of cars occupy a much narrower band than the extremes suggest.

The IQR’s great strength is that it is robust, that is, resistant to outliers. Because it depends only on the 25th and 75th percentiles, making the largest or smallest value even more extreme does not change it at all. That is why the IQR is the spread measure behind the box plot, and why it pairs naturally with the median (the 50th percentile) when describing skewed data.

Range vs. IQR

The range measures the full width of the data and is swayed by a single extreme value. The IQR measures the width of the central half and shrugs off extremes. When a distribution has outliers or a long tail, the IQR describes “typical” spread far more faithfully than the range.
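The contrast can be checked on a small invented list. This sketch uses the standard library's `statistics.quantiles` with `method="inclusive"`, which interpolates linearly between data points and should agree with the default behaviour of pandas' `quantile`; the helper names `value_range` and `iqr` are hypothetical. Stretching the maximum far out moves the range but leaves the IQR where it was.

```python
import statistics


def value_range(values):
    """Distance between the largest and smallest value."""
    return max(values) - min(values)


def iqr(values):
    """Interquartile range, using linear interpolation between data points."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Hypothetical readings, invented for illustration.
typical = [18.0, 21.5, 23.0, 24.5, 27.0, 29.5, 31.0]
# Same data, but the maximum is replaced by a far more extreme value.
stretched = typical[:-1] + [60.0]

print(value_range(typical), iqr(typical))      # 13.0 6.0
print(value_range(stretched), iqr(stretched))  # 42.0 6.0
```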


Mean Absolute Deviation

The range and IQR look at a couple of landmark values. A more complete idea is to ask: on average, how far is each value from the center? That is the mean absolute deviation (MAD).

For each value you compute its deviation from the mean, $x_i - \bar{x}$. Some deviations are positive (above the mean) and some negative (below). If you simply averaged them they would cancel to zero, because the mean is, by definition, the balance point. To stop the cancellation, you take the absolute value of each deviation before averaging:

$$\text{MAD} = \frac{1}{n}\sum_{i=1}^{n} \lvert x_i - \bar{x} \rvert$$

mean = mpg.mean()
mad = (mpg - mean).abs().mean()
print(round(mad, 2))
6.53

So the typical car’s fuel economy sits about 6.53 mpg away from the average of 23.51. That is wonderfully easy to interpret: it is a real average distance, in the original units of mpg.
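The cancellation argument can be verified by hand on a few numbers. The following sketch uses eight invented values (not the cars data) in plain Python: the signed deviations average to zero, while the absolute deviations give a usable distance.

```python
# Hypothetical values, invented for illustration.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = sum(values) / len(values)          # 5.0
deviations = [x - mean for x in values]   # signed distances from the mean

# Signed deviations cancel out, so their average carries no information.
mean_signed = sum(deviations) / len(values)

# Taking absolute values first prevents the cancellation.
mad = sum(abs(d) for d in deviations) / len(values)

print(mean_signed)  # 0.0
print(mad)          # 1.5
```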

The MAD is honest and intuitive, but it has one awkward property: the absolute-value function is hard to work with mathematically (it has a sharp corner and no clean derivative). For that reason statisticians more often square the deviations instead of taking their absolute value — which leads us to the two most important measures of spread.


Variance and Standard Deviation

The variance also averages the deviations from the mean, but it squares each one before averaging. Squaring kills the negative signs (just like absolute value) and, as a bonus, gives extra weight to values that lie far from the center.

Figure: A number line of data points with a vertical line at the mean. Each point's signed deviation from the mean becomes a square (its area); averaging those squares gives the variance, and the square root of that is the standard deviation.

For a full population of $N$ values with mean $\mu$:

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$$

The symbol $\sigma^2$ (sigma squared) is the conventional name for a population variance.

import numpy as np

deviations = mpg - mpg.mean()
squared = deviations ** 2
variance_pop = squared.mean()        # divide by N
print(round(variance_pop, 2))
60.94

That number — about 60.94 — is in squared mpg, which is impossible to picture. What does “60.94 squared miles per gallon” even mean? Nothing intuitive. This is the price of squaring.

To return to the original units, we take the square root. The result is the standard deviation, written $\sigma$ for a population:

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$$

std_pop = np.sqrt(variance_pop)
print(round(std_pop, 2))
7.81

The standard deviation is 7.81 mpg — back in plain miles per gallon, and therefore directly comparable to the mean. It says the typical car’s fuel economy lands roughly 7.8 mpg from the average. This is why standard deviation, not variance, is the number you usually report: it lives in the same units as your data.

Notice it is a little larger than the MAD (6.53). That is no accident — because variance squares the deviations, far-away values count for more, so the standard deviation is always at least as large as the mean absolute deviation.
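A small sketch can illustrate that relationship, assuming both measures divide by the same count. The helper `mad_and_std` and both lists are hypothetical. In the first list the deviations differ in size, so the standard deviation exceeds the MAD; in the second every value is equally far from the mean, which is the case where the two coincide.

```python
import math


def mad_and_std(values):
    """Return (mean absolute deviation, population standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    mad = sum(abs(x - mean) for x in values) / n
    variance = sum((x - mean) ** 2 for x in values) / n  # divide by N
    return mad, math.sqrt(variance)


# Hypothetical values, invented for illustration.
mad_a, std_a = mad_and_std([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
mad_b, std_b = mad_and_std([1.0, 3.0])

print(mad_a, std_a)  # 1.5 2.0
print(mad_b, std_b)  # 1.0 1.0
```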

Figure: Histogram of mpg with the mean (≈ 23.5) marked and a shaded band one standard deviation wide on each side. About 63% of cars fall inside this ±7.8 mpg band.

Population vs. Sample: Why We Divide by n − 1

So far we divided the sum of squared deviations by $N$, the count of values. That is correct when your data is the entire population. But most of the time your data is a sample drawn from a larger population, and you are using it to estimate the population’s spread. In that case dividing by $N$ produces a biased estimate: on average it comes out slightly too small.

The fix is Bessel’s correction: divide by $n - 1$ instead of $n$. The sample variance is

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

and the sample standard deviation $s$ is its square root. Here $s$ and $\bar{x}$ (sample mean) are the everyday estimators you compute from data in hand, while $\sigma$ and $\mu$ describe the unseen population.

The intuition for $n - 1$: you are measuring deviations from the sample mean $\bar{x}$, not the true population mean $\mu$. The sample mean is, by construction, the value that makes the sum of squared deviations as small as possible, because it sits right in the middle of your data. So the spread measured around $\bar{x}$ is never larger, and usually a touch smaller, than the spread measured around the real $\mu$. Dividing by the slightly smaller $n - 1$ nudges the estimate up just enough to compensate. You can think of it as “spending” one degree of freedom to estimate the mean before you can estimate the spread.
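One way to see the bias is a small simulation. The sketch below is an illustration under stated assumptions, not part of the cars analysis: it draws many samples of size 5 from a normal population whose variance is known to be 4, then averages the two competing estimates. In theory the divide-by-$n$ average should sit near $\frac{n-1}{n} \times 4 = 3.2$ and the divide-by-$(n-1)$ average near 4. All constants are hypothetical choices.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_SD = 2.0                  # assumed population standard deviation
TRUE_VARIANCE = TRUE_SD ** 2   # 4.0
SAMPLE_SIZE = 5
TRIALS = 20_000

total_n = 0.0          # running sum of divide-by-n estimates
total_n_minus_1 = 0.0  # running sum of divide-by-(n - 1) estimates

for _ in range(TRIALS):
    sample = [random.gauss(10.0, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    sample_mean = sum(sample) / SAMPLE_SIZE
    sum_sq = sum((x - sample_mean) ** 2 for x in sample)
    total_n += sum_sq / SAMPLE_SIZE
    total_n_minus_1 += sum_sq / (SAMPLE_SIZE - 1)

avg_n = total_n / TRIALS
avg_n_minus_1 = total_n_minus_1 / TRIALS

print("true variance:          ", TRUE_VARIANCE)
print("average, divide by n:   ", round(avg_n, 2))
print("average, divide by n-1: ", round(avg_n_minus_1, 2))
```

The exact printed averages depend on the random draws, but the divide-by-$n$ figure should come out consistently below the true variance.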

In pandas, this matters because of a quiet but important default: .var() and .std() use ddof=1 (the sample formula, $n-1$) out of the box. To get the population formula you must ask for ddof=0 explicitly.

print("sample   (ddof=1):", round(mpg.var(), 2), round(mpg.std(), 2))
print("population (ddof=0):", round(mpg.var(ddof=0), 2), round(mpg.std(ddof=0), 2))
sample   (ddof=1): 61.09 7.82
population (ddof=0): 60.94 7.81

With 398 cars the two answers barely differ (sample std 7.82 versus population std 7.81), because dividing by 397 instead of 398 changes almost nothing when $n$ is large. But for small samples the gap is real and the correction matters. The lesson to carry away: know which one pandas is giving you. By default it assumes you have a sample.
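The size of the gap follows directly from the two formulas. Both use the same sum of squared deviations around $\bar{x}$, so the sample variance is just the divide-by-$n$ variance scaled by $n/(n-1)$. The worked arithmetic below uses the figures printed above, plus a hypothetical sample of size 5 for contrast.

```latex
\begin{aligned}
s^2 &= \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2
     = \frac{n}{n-1}\cdot\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 \\[4pt]
n = 398:&\quad \frac{398}{397} \approx 1.0025,
  \qquad 60.94 \times 1.0025 \approx 61.09 \\[4pt]
n = 5:&\quad \frac{5}{4} = 1.25
\end{aligned}
```

With only five values, the corrected variance is 25% larger than the uncorrected one, whereas with 398 values the difference is about a quarter of a percent.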

The pandas default that trips people up

Series.std() and Series.var() default to ddof=1 — the sample standard deviation and variance. NumPy’s np.std() and np.var() default to the opposite, ddof=0 (the population formula). If you compare results across libraries and they disagree slightly, this is almost always why. Set ddof explicitly whenever it matters.
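The disagreement is easy to reproduce on a short invented list. This sketch assumes NumPy and pandas are installed; the values are hypothetical and chosen so the population standard deviation is exactly 2.

```python
import numpy as np
import pandas as pd

# Hypothetical values, invented for illustration.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
series = pd.Series(values)

numpy_default = np.std(values)   # ddof=0: divides by n
pandas_default = series.std()    # ddof=1: divides by n - 1

# Passing ddof explicitly makes the two libraries agree.
numpy_sample = np.std(values, ddof=1)
pandas_population = series.std(ddof=0)

print(numpy_default, pandas_population)
print(round(pandas_default, 4), round(numpy_sample, 4))
```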


Comparing Spread Across Groups

A measure of spread becomes most useful when you compare it across groups. Do cars from one region vary more in fuel economy than another? Standard deviation answers this directly. Let’s compute the mean and standard deviation of mpg for each region of origin.

summary = cars.groupby("origin")["mpg"].agg(["count", "mean", "std"]).round(2)
print(summary)
        count   mean   std
origin
europe     70  27.89  6.72
japan      79  30.45  6.09
usa       249  20.08  6.40

Two stories emerge, and they are different stories. The means say Japanese cars are the most fuel-efficient (30.45 mpg) and American cars the least (20.08 mpg). But the standard deviations — the spreads — are much closer together: European cars are the most variable (6.72), Japanese the least (6.09), with American cars in between (6.40).

This is a perfect illustration of why center and spread are independent ideas. American cars have by far the lowest average mpg, yet they are not the most variable group — Europe is. Knowing one number tells you nothing about the other. If you only compared means you would miss that European fuel economy is the least predictable of the three.

spread = cars.groupby("origin")["mpg"].agg(
    range_mpg=lambda x: x.max() - x.min(),
    std_mpg="std",
).round(2)
print(spread)
        range_mpg  std_mpg
origin
europe       28.1     6.72
japan        28.6     6.09
usa          30.0     6.40

Notice that by range, American cars look the most spread out (30.0), driven by a few extreme cars at each end, even though by standard deviation Europe edges ahead. This is a reminder that different measures of spread can rank the same groups differently, and that you should choose the one that fits your question.
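The same kind of disagreement can be reproduced on a tiny invented table. The sketch below is not the cars data: the groups `a` and `b`, the column names, and the `iqr` helper are all hypothetical, built so that one group wins on range while the other wins on standard deviation. It also adds the IQR to the aggregation as a third point of view.

```python
import pandas as pd

# Hypothetical groups, invented so that the measures disagree.
df = pd.DataFrame({
    "group": ["a"] * 6 + ["b"] * 6,
    "value": [10, 20, 20, 20, 20, 30, 12, 12, 12, 28, 28, 28],
})


def iqr(x):
    """Interquartile range of a Series (pandas' default interpolation)."""
    return x.quantile(0.75) - x.quantile(0.25)


spread = df.groupby("group")["value"].agg(
    range_value=lambda x: x.max() - x.min(),
    iqr_value=iqr,
    std_value="std",
)
print(spread.round(2))

# Which group looks widest depends on the measure you pick.
widest_by_range = spread["range_value"].idxmax()
widest_by_std = spread["std_value"].idxmax()
print(widest_by_range, widest_by_std)
```

Group `a` has two far-out values around a tight core, so its range is large but its IQR is zero; group `b` has no extremes but splits into two distant clusters, which inflates its standard deviation.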


Practice Exercises

Exercise 1: Spread of car weight

Compute the range, interquartile range, and sample standard deviation of the weight column. Which measure of spread would you trust most if a handful of very heavy cars were dragging the high end, and why?

Hint

The range is cars["weight"].max() - cars["weight"].min(), the IQR is cars["weight"].quantile(0.75) - cars["weight"].quantile(0.25), and the standard deviation is cars["weight"].std(). The robust measure is the one that ignores extreme values.

Exercise 2: Population vs. sample by hand

Using only the squared deviations, reproduce both the population variance (ddof=0) and the sample variance (ddof=1) of mpg without calling .var(). Confirm your two numbers match mpg.var(ddof=0) and mpg.var(ddof=1).

Hint

Compute sq = ((mpg - mpg.mean()) ** 2).sum(), then divide by len(mpg) for the population version and by len(mpg) - 1 for the sample version. They should equal 60.94 and 61.09.

Exercise 3: Most consistent cylinder group

Group the cars by number of cylinders and compute the standard deviation of mpg for each group. Which cylinder count has the most consistent (least variable) fuel economy? Does the group with the highest average mpg also have the smallest spread?

Hint

Use cars.groupby("cylinders")["mpg"].agg(["mean", "std"]). The most consistent group is the one with the smallest std. Compare it against the group with the largest mean to see whether they are the same.


Summary

The center of a distribution is only half the story; to describe data faithfully you also need its spread. The range is quick but fragile, swayed by a single extreme value. The interquartile range measures the middle 50% and is robust to outliers. The mean absolute deviation averages how far values fall from the mean in their original units. Variance averages the squared deviations, and the standard deviation (its square root) brings that back to the original units, making it the most widely reported measure of spread. Finally, dividing by $n - 1$ instead of $N$ (Bessel’s correction) gives an unbiased estimate of the variance when your data is a sample rather than a whole population, which is the default pandas assumes.

Key Concepts

  • Range: the difference between the maximum and minimum value; simple but outlier-sensitive.
  • Interquartile range (IQR): $Q_3 - Q_1$, the spread of the central 50%; robust to extremes.
  • Mean absolute deviation (MAD): the average absolute distance of values from the mean, in original units.
  • Variance: the average squared deviation from the mean ($\sigma^2$ for a population, $s^2$ for a sample).
  • Standard deviation: the square root of the variance, expressed in the data’s original units.
  • Bessel’s correction: dividing by $n - 1$ for a sample to correct the underestimate of population spread.
  • ddof: the pandas/NumPy argument subtracted from the count in the divisor; ddof=1 divides by $n-1$ (sample), ddof=0 divides by $N$ (population).

Why This Matters

Variability is where the risk lives. Two investments, two manufacturing lines, or two model predictions can share an average and yet behave completely differently — one steady, one volatile. Standard deviation is the language analysts use to quantify that volatility, and the sample-versus-population distinction underlies nearly every statistical test, confidence interval, and machine-learning metric you will meet later. Master spread now and the rest of statistics rests on solid ground.


Next Steps

Continue to Lesson 5 - Z-Scores and Standardization

Use the mean and standard deviation together to measure how unusual any single value is, and put different variables on a common scale.


Continue Building Your Skills

You can now describe not just where a distribution sits but how widely it spreads — and you know exactly which number pandas hands you when you call .std(). Next you will combine the mean and standard deviation into a single powerful idea, the z-score, that tells you how unusual any one value is and lets you compare numbers measured on entirely different scales.