Lesson 5 - Z-scores

Welcome to Z-scores

In the last lesson you learned that the standard deviation measures how far a typical value sits from the mean. But “7.82 miles per gallon away from average” is awkward to reason about, and it is useless the moment you want to compare it to a weight measured in pounds. The z-score fixes both problems with one small idea: instead of measuring distance in the original units, measure it in standard deviations.

A car that gets 39 mpg and a car that weighs 1,300 pounds are described in totally different units, yet a z-score can tell you that one is more unusual for its kind than the other. That single comparison — across scales — is what makes the z-score one of the most reused tools in all of data work.

By the end of this lesson, you will be able to:

  • Standardize a value with the z-score formula and explain what it measures
  • Compute z-scores for a whole column in pandas
  • Read a z-score’s sign and magnitude to judge how unusual a value is
  • Use z-scores to compare values measured on different scales, and check the empirical rule against real data

You only need pandas and the cars dataset from this module. Let’s begin.


The Standardization Idea

Picture two questions about the same car: “Is 46.6 mpg a lot?” and “Is 2,110 pounds a lot?” You cannot answer either one in isolation, because “a lot” only means something relative to the rest of the data. Standardization rephrases every value as the same question: how far is this from average, measured in standard deviations?

Two aligned horizontal axes — a raw scale marked from mean minus two sigma to plus two sigma above a z scale from minus two to two — with a vertical connector mapping one raw value to its z-score.
Standardizing maps each raw value onto a z scale that recenters the data at 0 and rescales it into standard-deviation units.

That rephrasing is the z-score. For a value $x$ drawn from data with mean $\mu$ and standard deviation $\sigma$, the z-score is:

$$z = \frac{x - \mu}{\sigma}$$

The numerator $x - \mu$ is how far the value is from the mean in the original units. Dividing by $\sigma$ converts that distance into a count of standard deviations. A z-score of $+2$ means “two standard deviations above the mean”; a z-score of $-0.5$ means “half a standard deviation below it.” The units cancel, which is exactly why z-scores from different columns can be compared directly.

Let’s load the data and pin down the two numbers the formula needs.

import pandas as pd

cars = pd.read_csv("https://datatweets.com/datasets/cars.csv")

mu = cars["mpg"].mean()
sd = cars["mpg"].std()          # ddof=1 (sample standard deviation), the pandas default
print(round(mu, 2), round(sd, 2))
23.51 7.82

So fuel economy in this dataset has a mean of 23.51 mpg and a standard deviation of 7.82 mpg. Throughout this lesson we use pandas’ default .std(), which divides by $n - 1$ (the sample standard deviation, ddof=1). Keep that consistent so your z-scores stay comparable.
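Before standardizing a whole column, it can help to apply the formula to one value by hand. The sketch below uses the rounded mean and standard deviation printed above and the 46.6 mpg figure from earlier in the lesson; z_score is a hypothetical helper name, not a pandas function. Because the inputs are rounded, results can differ in the second decimal from values computed on the full-precision column.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Rounded mean and standard deviation of mpg, as printed above
mu, sd = 23.51, 7.82

z = z_score(46.6, mu, sd)   # (46.6 - 23.51) / 7.82
print(round(z, 2))          # 2.95
```

A value equal to the mean gives 0, and a value one standard deviation below the mean gives -1, whatever the original units were.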


Computing Z-scores in pandas

Because pandas operations work on whole columns at once, you can standardize every car in a single expression — subtract the mean, divide by the standard deviation, and pandas applies it row by row.

cars["mpg_z"] = (cars["mpg"] - mu) / sd
print(cars[["name", "mpg", "mpg_z"]].head(5).round(2).to_string(index=False))
                     name  mpg  mpg_z
chevrolet chevelle malibu 18.0  -0.71
        buick skylark 320 15.0  -1.09
       plymouth satellite 18.0  -0.71
            amc rebel sst 16.0  -0.96
              ford torino 17.0  -0.83

Every car now carries a z-score. The first row, a Chevrolet getting 18 mpg, scores $-0.71$: its fuel economy sits about 0.7 standard deviations below average. The Buick at 15 mpg is further below, at $-1.09$. None of these early-1970s sedans is fuel-efficient, and the z-scores say so with a sign and a size.

The mean has a z-score of zero

A value exactly equal to the mean standardizes to $z = 0$, because the numerator $x - \mu$ becomes zero. Standardizing a whole column always produces a new column with a mean of 0 and a standard deviation of 1; that is the entire point. The shape of the distribution does not change; only its scale and center do.
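These properties are easy to check numerically. The sketch below uses a small made-up Series rather than the cars data, and uses skewness as one convenient stand-in for “shape”; the comparisons allow for tiny floating-point error.

```python
import pandas as pd

# Made-up values, used only to illustrate the property
values = pd.Series([12.0, 15.0, 18.0, 24.0, 31.0])

# Same ddof=1 default for .std() as the rest of the lesson
z = (values - values.mean()) / values.std()

print(abs(z.mean()) < 1e-9)                    # mean is 0, up to rounding error
print(abs(z.std() - 1) < 1e-9)                 # standard deviation is 1
print(abs(z.skew() - values.skew()) < 1e-9)    # skewness is unchanged
```

All three checks should print True for any column with at least three distinct values, as long as the same ddof is used to standardize and to check.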


Reading Sign and Magnitude

A z-score packs two pieces of information into one number, and you read them separately.

  • The sign tells you the direction. Positive means above the mean; negative means below it.
  • The magnitude tells you how far. A z-score near 0 is unremarkable; one past $\pm 2$ is starting to look unusual; one past $\pm 3$ is rare.
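The two readings above can be sketched as a small function. The name describe_z and the exact labels are hypothetical, and the cutoffs at 2 and 3 are the rules of thumb from the list, not hard statistical boundaries; everything with a magnitude of 2 or less is lumped together as “unremarkable” here for simplicity.

```python
def describe_z(z):
    """Label a z-score by direction and by a rough rarity band."""
    # Sign: which side of the mean the value sits on
    direction = "above" if z > 0 else "below" if z < 0 else "at"

    # Magnitude: how far from the mean, ignoring direction
    size = abs(z)
    if size > 3:
        band = "rare"
    elif size > 2:
        band = "unusual"
    else:
        band = "unremarkable"
    return direction, band

print(describe_z(-0.71))  # ('below', 'unremarkable')
print(describe_z(2.95))   # ('above', 'unusual')
```

Either way, sign and magnitude are read as two separate facts about the same number.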

This is what makes z-scores a natural tool for finding unusual values — values that stand far from the crowd. Sort by the z-score and the extremes rise to the top:

top = cars.sort_values("mpg_z", ascending=False).head(3)
print(top[["name", "mpg", "mpg_z"]].round(2).to_string(index=False))
                name  mpg  mpg_z
           mazda glc 46.6   2.95
 honda civic 1500 gl 44.6   2.70
vw rabbit c (diesel) 44.3   2.66

The most fuel-efficient car in the dataset is the mazda glc at 46.6 mpg, with a z-score of $+2.95$, almost three standard deviations above average. That is genuinely exceptional: in a normal (bell-shaped) distribution, fewer than 1 value in 500 lies that far above the mean. The two runners-up, a Honda and a diesel VW, sit just behind it. The z-score did not just rank these cars; it quantified how extreme they are in a way the raw mpg never could.
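To put a number on “very rare”, one option is to ask what share of a perfectly normal distribution lies above z = 2.95. The sketch below uses Python's standard-library NormalDist; it describes an idealized bell curve, which is an assumption and not a property of the mpg column itself.

```python
from statistics import NormalDist

# Standard normal: mean 0, standard deviation 1 (the scale z-scores live on)
standard_normal = NormalDist(mu=0, sigma=1)

# Share of the distribution lying above z = 2.95
tail = 1 - standard_normal.cdf(2.95)
print(round(tail * 100, 2))   # percentage of values above z = 2.95
```

The result is a fraction of one percent, which is why a z-score near 3 stands out. For skewed data the true share can be noticeably different, so treat this as a reference point only.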

The figure below shows where these standard-deviation bands fall across the whole mpg distribution, with the mazda glc marked at the far right.

Histogram of car fuel economy in miles per gallon, with a vertical line at the mean of 23.5 and dashed lines marking one, two, and three standard deviations on each side. The mazda glc is marked at 46.6 mpg with a z-score of plus 2.95, and labels report that 63.1 percent of cars fall within one standard deviation, 97.5 percent within two, and 100 percent within three.
Fuel economy with standard-deviation bands. The mean is 23.5 mpg; each dashed line is one standard deviation (7.82 mpg) further out. The mazda glc sits almost three standard deviations above the mean, which is why its z-score is +2.95.

Comparing Across Different Scales

Here is where the z-score earns its keep. Fuel economy is measured in miles per gallon; weight is measured in pounds. You cannot compare 46.6 mpg to 2,110 pounds directly — the units make it meaningless. But once both are standardized, they live on the same scale, and the comparison becomes obvious.

Standardize weight the same way, then look at the mazda glc on both measures at once.

mu_w = cars["weight"].mean()
sd_w = cars["weight"].std()
cars["weight_z"] = (cars["weight"] - mu_w) / sd_w

car = cars[cars["name"] == "mazda glc"]
print(car[["name", "mpg", "mpg_z", "weight", "weight_z"]].round(2).to_string(index=False))
     name  mpg  mpg_z  weight  weight_z
mazda glc 46.6   2.95    2110     -1.02

Now you can read both standings off one row. The mazda glc is $+2.95$ on mpg (extraordinarily fuel-efficient) and $-1.02$ on weight (about one standard deviation lighter than average). Comparing the magnitudes, its fuel economy is far more exceptional than its lightness: $2.95$ standard deviations versus $1.02$. The car is somewhat light, but it is extraordinarily economical. Without z-scores you would be stuck comparing “46.6 mpg” to “2,110 pounds” and unable to say which standing is more remarkable.
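When several columns need the same treatment, the subtract-and-divide step can be applied to all of them in one expression. This is a minimal sketch on made-up rows (not taken from the cars dataset); the _z suffix simply mirrors the column names used in this lesson.

```python
import pandas as pd

# Made-up rows standing in for a few cars; not real dataset values
df = pd.DataFrame({
    "mpg":    [18.0, 24.0, 31.0, 15.0, 27.0],
    "weight": [3500, 2600, 2000, 3900, 2300],
})

cols = ["mpg", "weight"]

# DataFrame minus Series aligns on column labels, so each column is
# centered on its own mean and scaled by its own standard deviation
z = (df[cols] - df[cols].mean()) / df[cols].std()
z.columns = [c + "_z" for c in cols]

print(z.round(2))
```

Each row can then be read the same way as the mazda glc row above: one unit-free standing per column.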

Why this matters beyond cars

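This same trick, putting every feature on a common scale by subtracting the mean and dividing by the standard deviation, is the standardization step run before many machine-learning algorithms (k-nearest neighbors, logistic regression, k-means, and others). Methods such as k-nearest neighbors and k-means measure distances between rows, and a feature with large raw units would otherwise dominate simply because its numbers are bigger; regularized logistic regression is sensitive to feature scale for a related reason, because its penalty acts on coefficient sizes. Standardizing levels the playing field. You will meet it again as StandardScaler in scikit-learn, which applies the same formula but divides by the population standard deviation (ddof=0), so its output differs very slightly from the ddof=1 z-scores you just computed.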
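One detail worth knowing before that handoff: scikit-learn's StandardScaler is documented as using the population standard deviation (ddof=0), while this lesson uses pandas' default of ddof=1. The sketch below compares the two conventions on made-up values using pandas only, so it illustrates the effect of ddof and does not call scikit-learn itself.

```python
import math
import pandas as pd

values = pd.Series([12.0, 15.0, 18.0, 24.0, 31.0])  # made-up values
n = len(values)

# This lesson's convention: sample standard deviation (divide by n - 1)
z_sample = (values - values.mean()) / values.std(ddof=1)

# Population convention: divide by n
z_population = (values - values.mean()) / values.std(ddof=0)

# Every z-score is stretched by the same constant factor, sqrt(n / (n - 1))
ratio = z_population / z_sample
print(ratio.round(4).tolist())
print(round(math.sqrt(n / (n - 1)), 4))
```

The factor shrinks toward 1 as n grows (for n = 400 it is about 1.0013), so the two conventions rank values identically and differ only slightly in magnitude on datasets with hundreds of rows.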


The Empirical Rule

For data shaped like a bell curve, z-scores follow a famous pattern called the empirical rule, also known as the 68–95–99.7 rule. It says that roughly:

  • 68% of values fall within 1 standard deviation of the mean ($-1 < z < 1$)
  • 95% fall within 2 standard deviations ($-2 < z < 2$)
  • 99.7% fall within 3 standard deviations ($-3 < z < 3$)

A bell curve with nested shaded bands showing 68% of values within one standard deviation, 95% within two, and 99.7% within three, with sigma marks along the horizontal axis.
For a bell-shaped distribution, about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean.

This is a powerful shortcut: if you know a distribution is roughly bell-shaped, a single z-score tells you immediately how common or rare a value is. But the rule is exact only for a perfect bell curve, and real data is rarely perfect. Let’s test it on the actual mpg column.

for k in [1, 2, 3]:
    pct = (cars["mpg_z"].abs() <= k).mean() * 100
    print(k, round(pct, 1))
1 63.1
2 97.5
3 100.0

The results are close to the empirical rule but not identical. 63.1% of cars fall within one standard deviation, where the rule predicts about 68%. Within two standard deviations we find 97.5% (rule: ~95%), and within three, 100% (rule: ~99.7%). The fit is good enough to be useful, but the gaps are real — and they tell you something.

The mpg distribution is not a symmetric bell. It is right-skewed: most cars cluster at modest fuel economy, with a long tail of a few very efficient outliers like the mazda glc stretching to the right. That skew is why slightly fewer values land within one standard deviation than the rule predicts, while almost everything fits inside two. The empirical rule is a guide for bell-shaped data, not a law — always check it against your actual distribution before leaning on it.

The empirical rule assumes a bell shape

The 68–95–99.7 figures only hold when the data is approximately normal (symmetric and bell-shaped). For skewed data — incomes, house prices, fuel economy — the percentages drift, as you just saw. Use the rule as a quick sanity check, but compute the real fractions when the answer matters.
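The textbook percentages themselves come from the normal distribution, where the share of values within k standard deviations of the mean equals erf(k / sqrt(2)). The sketch below reproduces them with the standard library only; share_within is a hypothetical helper name.

```python
import math

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in [1, 2, 3]:
    print(k, round(share_within(k) * 100, 2))
# 1 68.27
# 2 95.45
# 3 99.73
```

Setting these side by side with the 63.1 / 97.5 / 100.0 measured on mpg makes the size of each gap explicit.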


Practice Exercises

Exercise 1: Standardize horsepower

Compute z-scores for the horsepower column, then find the car with the highest horsepower z-score. How many standard deviations above the mean is it, and does that match your intuition for the most powerful car in the dataset?

Hint

Build the column with cars["hp_z"] = (cars["horsepower"] - cars["horsepower"].mean()) / cars["horsepower"].std(), then cars.sort_values("hp_z", ascending=False).head(1). The horsepower column has no missing values here, so you can standardize it directly.

Exercise 2: Find a balanced car

Use the mpg_z and weight_z columns to find cars whose fuel economy and weight are both close to average, that is, both z-scores between $-0.25$ and $+0.25$. How many cars qualify, and what does it mean for a car to be “average” on both scales at once?

Hint

Filter with a combined condition: cars[(cars["mpg_z"].abs() < 0.25) & (cars["weight_z"].abs() < 0.25)]. Count the result with len(...). These are the unremarkable middle-of-the-pack cars on both measures.

Exercise 3: Test the empirical rule on weight

Repeat the empirical-rule check from this lesson, but on weight_z instead of mpg_z. Compute the percentage of cars within 1, 2, and 3 standard deviations and compare them to 68/95/99.7. Is weight closer to or further from a bell shape than mpg?

Hint

Reuse the loop pattern: for k in [1, 2, 3]: print(k, round((cars["weight_z"].abs() <= k).mean() * 100, 1)). Compare each percentage to the empirical-rule target for that band.


Summary

You learned to standardize a value into a z-score: the number of standard deviations it sits from the mean, computed as $z = (x - \mu) / \sigma$. The sign tells you direction and the magnitude tells you how unusual the value is, which makes z-scores a clean way to flag outliers like the mazda glc at $z = +2.95$. Because standardization strips away units, z-scores from different columns are directly comparable, letting you weigh a car’s mpg standing against its weight standing on one scale. Finally, you tested the empirical rule against the real mpg data and saw it hold approximately (63.1% within one standard deviation rather than the textbook 68%), a reminder that the rule assumes a bell shape the data only roughly matches.

Key Concepts

  • Standardization: rescaling a value to its distance from the mean, measured in standard deviations.
  • Z-score: $z = (x - \mu) / \sigma$; how many standard deviations a value lies above ($+$) or below ($-$) the mean.
  • Magnitude vs. sign: sign gives direction; magnitude gives how unusual. Beyond $\pm 2$ is uncommon; beyond $\pm 3$ is rare.
  • Empirical rule: for bell-shaped data, ~68% / ~95% / ~99.7% of values fall within 1 / 2 / 3 standard deviations.
  • Skew: asymmetry in a distribution that makes the empirical rule’s percentages drift, as with right-skewed mpg.

Why This Matters

Z-scores turn “is this big?” into a question every column can answer the same way, which is why they sit at the heart of outlier detection, A/B test statistics, and the feature scaling that precedes most machine-learning models. Once you can standardize, you can compare anything to anything — and you can tell at a glance whether a value is ordinary or extraordinary for its kind.


Next Steps

Continue to Lesson 6 - Guided Project: Profiling Fuel Economy

Put the whole module together in a hands-on project that profiles the cars dataset end to end.


Continue Building Your Skills

You now have every measure of center and spread this module set out to teach, and in the z-score you have the tool that ties them together — a single, unit-free way to say how unusual any value is. Next you will bring all of it to bear in a guided project, profiling the cars dataset from its center to its tails and deciding, for yourself, which numbers actually tell the story.