Lesson 1 - The Mean

Welcome to The Mean

Ask anyone for “the average” and they will hand you the mean: add up the numbers, divide by how many there are. It is the most reported statistic in the world, and the one we trust almost without thinking. But that easy familiarity hides a few sharp edges — the mean can be pulled around by a single extreme value, and a single overall average can quietly bury big differences between groups.

In this lesson you will compute the mean by hand and with pandas, see why it sits at the exact balance point of your data, watch it get tugged by skew, and discover that an overall mean is really a blend of its subgroups in disguise. You will use a real dataset of 1970s and 80s cars to do it.

By the end of this lesson, you will be able to:

Write the formula for the mean and tell a population mean $\mu$ apart from a sample mean $\bar{x}$
Compute the mean of a column in pandas and reproduce it from the raw formula
Explain why the mean is the data’s balance point — the deviations around it sum to zero
Recognize when the mean is misleading, and see how an overall mean blends its subgroup means

You only need a little Python and pandas. Let’s begin.

What the Mean Is

The arithmetic mean is the sum of all the values divided by how many values there are. For a set of $n$ numbers $x_1, x_2, \dots, x_n$, the mean is:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

That symbol $\bar{x}$ (“x-bar”) is the sample mean — the mean of the data you actually have in front of you. When the numbers represent an entire population rather than a sample, we write the mean as $\mu$ (the Greek letter “mu”) and use $N$ for the population size:

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

The arithmetic is identical; the symbols just tell the reader whether you are describing a sample or a whole population. As you saw in the first module, $\mu$ is a fixed fact about a population, while $\bar{x}$ is an estimate that shifts from sample to sample.
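To make the distinction concrete, here is a minimal sketch using only the standard library. The ten-value `population` list is invented for illustration (it is not drawn from the cars data), and the seed is there only so that reruns draw the same samples. The population mean is computed once and never changes, while the mean of each random sample of four values lands somewhere different.

```python
import random
import statistics

# Hypothetical population of ten values (invented; not from the cars data).
population = [12, 15, 18, 20, 22, 24, 27, 30, 33, 39]

# mu: a single fixed number describing the whole population.
mu = sum(population) / len(population)

# x-bar: recomputed for each random sample of four values.
rng = random.Random(0)  # seeded only so reruns draw the same samples
sample_means = [statistics.mean(rng.sample(population, 4)) for _ in range(5)]

print("population mean:", mu)
print("sample means:   ", sample_means)
```

Each sample mean is an estimate of the same `mu`, and the spread among them is the sample-to-sample variation described above.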

Let’s load the cars dataset and meet the column we will summarize.

import pandas as pd

cars = pd.read_csv("https://datatweets.com/datasets/cars.csv")
print(cars.shape)
print(cars[["mpg", "cylinders", "weight", "origin", "name"]].head())

(398, 9)
    mpg  cylinders  weight origin                       name
0  18.0          8    3504    usa  chevrolet chevelle malibu
1  15.0          8    3693    usa          buick skylark 320
2  18.0          8    3436    usa         plymouth satellite
3  16.0          8    3433    usa              amc rebel sst
4  17.0          8    3449    usa                ford torino

These are 398 cars from the 1970s and early 80s, each with its fuel economy (mpg, miles per gallon), engine size, weight, and region of origin. We will focus on mpg — a single number that captures how thirsty each car is.

Computing the Mean in pandas

pandas gives you the mean of any column with a single method call.

mean_mpg = cars["mpg"].mean()
print(round(mean_mpg, 2))

23.51

So the average car in this dataset gets 23.51 miles per gallon. Because this is the mean of every car we have, you could reasonably call it $\mu = 23.51$ for this population of 398 vehicles.

To prove there is no magic inside .mean(), build it from the formula yourself — the total of the column divided by its length:

n = len(cars["mpg"])
total = cars["mpg"].sum()
print(round(total / n, 2))

23.51

Identical. The .mean() method is just $\frac{1}{n}\sum x_i$ wrapped up for convenience.

The mean ignores nothing

Every value in the column contributes to the mean. That is a strength — it uses all your data — but also the source of its biggest weakness: a single very large or very small value moves the mean for everyone. The median, which you will meet next lesson, does not share that vulnerability.
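A small sketch of that sensitivity, using the mpg values of the five cars printed by `cars.head()` above. The extra value of 100 mpg is a made-up outlier added purely for illustration, and `statistics.median` stands in for the median you will study in the next lesson.

```python
import statistics

# mpg of the five cars shown in the head() output earlier in the lesson.
mpg = [18.0, 15.0, 18.0, 16.0, 17.0]

# Hypothetical extreme value, invented for illustration only.
with_outlier = mpg + [100.0]

print("without outlier:", statistics.mean(mpg), statistics.median(mpg))
print("with outlier:   ", round(statistics.mean(with_outlier), 2),
      statistics.median(with_outlier))
```

One added value lifts the mean from 16.8 to about 30.67, while the median only moves from 17.0 to 17.5.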

The Mean as a Balance Point

Here is the idea that makes the mean special. Imagine each data value as a weight placed along a ruler. The mean is the exact point where the ruler balances — the values above it and the values below it cancel out perfectly.

Figure: A beam resting on a triangular fulcrum positioned at the mean, with data points as weights along the beam and deviation distances marked on each side. The fulcrum sits exactly at the mean, where the positive deviations on the right and the negative deviations on the left cancel to zero.

We can show this precisely. A deviation is the signed distance from a value to the mean, $x_i - \bar{x}$. If the mean is truly the balance point, then all those deviations, positive for values above the mean and negative for values below it, must add up to zero.

deviations = cars["mpg"] - cars["mpg"].mean()
print(round(deviations.sum(), 6))

0.0

The deviations sum to zero. This is not a coincidence of this dataset; it is true for any set of numbers, and it follows straight from the formula:

$$\sum_{i=1}^{n} (x_i - \bar{x}) = \sum_{i=1}^{n} x_i - n\bar{x} = n\bar{x} - n\bar{x} = 0$$

Because $\bar{x} = \frac{1}{n}\sum x_i$, the total of the values equals $n\bar{x}$, and the deviations cancel exactly. That balancing act is what makes the mean the natural “center of mass” of your data. As you will see in a later lesson, it is also why those deviations have to be squared before we can measure spread: on their own they always cancel to nothing.
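The cancellation can also be checked on a handful of values. The sketch below reuses the five mpg values from the `head()` output and computes the deviations twice: once with `fractions.Fraction`, where the arithmetic is exact, and once with ordinary floats, where a tiny rounding residue may be left over. Using `Fraction` here is an illustrative choice, not something pandas does internally.

```python
from fractions import Fraction

# mpg of the five cars shown in the head() output, as whole numbers.
mpg = [18, 15, 18, 16, 17]

# Exact arithmetic: the mean is 84/5 and the deviations cancel exactly.
exact_mean = Fraction(sum(mpg), len(mpg))
exact_devs = [x - exact_mean for x in mpg]
print("exact sum of deviations:", sum(exact_devs))

# The two sides of the balance: total pull above and below the mean.
above = sum(d for d in exact_devs if d > 0)
below = -sum(d for d in exact_devs if d < 0)
print("pull above:", above, "| pull below:", below)

# Floating-point arithmetic: zero up to rounding error.
float_mean = sum(mpg) / len(mpg)
float_devs = [x - float_mean for x in mpg]
print("float sum of deviations:", sum(float_devs))
```

The positive and negative deviations each total 13/5, which is the balance in the beam picture. The float version is why a deviation sum is usually rounded before printing.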

When the Mean Misleads

The balance-point property is elegant, but it is also the mean’s Achilles’ heel. Because every value pulls on the balance, a few extreme values can drag the mean away from where most of the data actually sits. This happens whenever a distribution is skewed — stretched out toward one side.

The weight column is a good example. Compare its mean to its median, the middle value when the data is sorted:

print("mean  ", round(cars["weight"].mean(), 1))
print("median", cars["weight"].median())

mean   2970.4
median 2803.5

The mean weight is about 167 pounds heavier than the median. That gap is a fingerprint of skew: a tail of very heavy cars (the big V8 sedans of the era) pulls the mean upward, while the median — which only cares about the middle car, not how heavy the heaviest ones are — stays put. The figure below shows the distribution with both markers.

Figure: Histogram of car weight in pounds, right-skewed, with the mean drawn to the right of the median. Car weight is right-skewed: a tail of heavy cars pulls the mean (orange) to the right of the median (green). The marker labels are rounded to the nearest pound; the exact values are 2,970.4 (mean) and 2,803.5 (median).

When data is roughly symmetric, the mean and median nearly agree and the mean is a fine summary. When data is skewed or has outliers, the mean can tell a story most of your data would not recognize. Knowing which situation you are in is exactly why the next lesson pairs the mean with the median.
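One rough way to tell the two situations apart is to look at the gap between mean and median. The sketch below defines a hypothetical helper, `mean_median_gap`, and applies it to two invented lists of weights, one symmetric and one with a single heavy value in the right tail. The sign of the gap is only an informal hint of skew, not a formal test.

```python
import statistics

def mean_median_gap(values):
    """Return mean minus median; a positive gap hints at a right tail."""
    return statistics.mean(values) - statistics.median(values)

# Invented weights in pounds, for illustration only.
symmetric = [2000, 2500, 3000, 3500, 4000]
right_skewed = [2000, 2200, 2400, 2600, 5000]

print("symmetric gap:   ", mean_median_gap(symmetric))
print("right-skewed gap:", mean_median_gap(right_skewed))
```

The symmetric list has no gap at all, while the skewed list shows a gap of 440 pounds. By the same measure, the lesson's weight column has a gap of roughly 167 pounds (2,970.4 minus 2,803.5).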

An Overall Mean Is a Blend of Subgroups

There is one more thing the mean hides, and it is the most useful insight in this lesson. A single overall average is almost never the whole story, because it is silently averaging across groups that may be very different from each other.

Our cars come from three regions. Look at the average mpg within each, along with how many cars each region contributes:

by_origin = cars.groupby("origin")["mpg"].agg(["mean", "count"]).round(2)
print(by_origin)

         mean  count
origin
europe  27.89     70
japan   30.45     79
usa     20.08    249

These groups are worlds apart. Japanese cars average 30.45 mpg; American cars average just 20.08 mpg, more than ten miles per gallon lower. The overall mean of 23.51 sits closer to the USA figure, and now you can see why: the USA contributes 249 of the 398 cars, so it dominates the average.

In fact, the overall mean is exactly the subgroup means combined, each weighted by how many cars are in its group. Multiply each group’s mean by its count, add those up, and divide by the total count:

blend = (by_origin["mean"] * by_origin["count"]).sum() / by_origin["count"].sum()
print(round(blend, 2))

23.51

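The blend reproduces the overall mean of 23.51. (Because by_origin holds group means already rounded to two decimals, the match is to two decimal places; with unrounded group means the identity holds exactly, up to floating-point error.) This is no accident: the overall mean is a weighted average of its subgroup means, where the weights are the group sizes:

$$\bar{x} = \frac{\sum_g n_g \, \bar{x}_g}{\sum_g n_g}$$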

That single equation is why the overall figure leans toward the largest group. It also points straight at the next lesson: when the things you are averaging carry different weights — different group sizes, different importance, different reliability — the plain mean is no longer the right tool, and you need the weighted mean.
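To see why the weights matter, the sketch below plugs the group means and counts from the table above into the formula and compares the result with a naive "average of the averages" that ignores group size. The group means are the rounded values printed in the lesson, so the weighted result agrees with the overall mean only to two decimals.

```python
# Group means and counts as printed in the lesson's groupby table.
groups = {
    "europe": (27.89, 70),
    "japan": (30.45, 79),
    "usa": (20.08, 249),
}

total_count = sum(n for _, n in groups.values())

# Weighted by group size: matches the overall mean.
weighted = sum(mean * n for mean, n in groups.values()) / total_count

# Unweighted: treats all three regions as if they were the same size.
naive = sum(mean for mean, _ in groups.values()) / len(groups)

print("weighted by count:", round(weighted, 2))  # 23.51
print("naive average:    ", round(naive, 2))     # 26.14
```

The naive figure overstates fuel economy by more than two and a half mpg, because it gives the 70 European and 79 Japanese cars the same say as the 249 American ones.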

Practice Exercises

Exercise 1: Mean acceleration, two ways

Compute the mean of the acceleration column using .mean(), then reproduce the same number from the raw formula (sum divided by count). Confirm the two match to two decimal places.

Hint

The method version is cars["acceleration"].mean(). The manual version is cars["acceleration"].sum() / len(cars["acceleration"]). Wrap both in round(..., 2) before comparing.

Exercise 2: Deviations always cancel

Pick the horsepower column and show that the deviations from its mean sum to (approximately) zero. The column has a few missing values, so drop them first. Why might the result print as a tiny number like 1e-13 instead of an exact 0?

Hint

Use hp = cars["horsepower"].dropna(), then (hp - hp.mean()).sum(). The tiny leftover is floating-point rounding — computers store decimals with limited precision, so the cancellation is exact in math but off by a hair in binary. Round it to see the intended 0.0.

Exercise 3: Mean mpg by model year

Group the cars by model_year and compute the mean mpg for each year. Did fuel economy improve over the 1970–82 span? Then check that the overall mean still equals the year means blended by their counts.

Hint

Use cars.groupby("model_year")["mpg"].agg(["mean", "count"]). To verify the blend, multiply each year’s mean by its count, sum that, and divide by the total count — it should land back on 23.51.

Summary

You met the most familiar statistic of all and looked past its simplicity. The mean is the sum of the values over their count, written $\bar{x}$ for a sample and $\mu$ for a population. It sits at the data’s balance point — the deviations around it always sum to zero — which makes it the natural center of a symmetric distribution. But that same property makes it sensitive to skew and outliers, which is why the mean of car weight sat well above the median. And crucially, an overall mean is a weighted blend of its subgroup means: the cars’ 23.51 mpg leaned toward the largest group, the USA, exactly as the group-size weights predict.

Key Concepts

Arithmetic mean: the sum of all values divided by the number of values, $\bar{x} = \frac{1}{n}\sum x_i$.
Population mean $\mu$: the mean of an entire population; the sample mean $\bar{x}$ estimates it.
Deviation: the signed distance from a value to the mean, $x_i - \bar{x}$; deviations always sum to zero.
Balance point: the mean is the value at which the data balances, its center of mass.
Skew: an asymmetric tail that pulls the mean away from the median.
Weighted blend: an overall mean equals its subgroup means weighted by their group sizes.

Why This Matters

The mean is the number you will report, read, and be misled by more than any other. Knowing that it balances the data, that a long tail can drag it somewhere unrepresentative, and that it secretly leans toward your biggest subgroup is what keeps you from reporting an “average” that no one in your data would recognize. Every dashboard headline and KPI rests on these instincts.

Next Steps

Continue to Lesson 2 - The Weighted Mean and the Median

Average values that carry different weights, and meet the median — the center that ignores outliers.

Back to Module Overview

Return to the Measures of Center & Variability module overview

Continue Building Your Skills

You now know what the mean really measures — and the three ways it can quietly mislead you, from skew to lopsided subgroups. Next you will pick up the two tools that handle exactly those cases: the weighted mean, for when your values do not all count equally, and the median, for when a few extreme values would otherwise steer the average off course.
