Lesson 2 - Probability Rules

Welcome to Probability Rules

In the last lesson you learned to put a number on chance, both by counting how often something happens in data and by counting equally likely outcomes in theory. That gives you the probability of a single event. But real questions rarely stop there. You want the chance that something does not happen, or that one thing or another happens. To answer those, you need a small set of rules that combine probabilities reliably.

In this lesson you will give chance a precise vocabulary — sample spaces and events — see the handful of axioms every probability must obey, and then learn the two rules you will reach for constantly: the complement rule for “not,” and the addition rule for “or.” You will check every rule against real penguin data and against a fair die.

By the end of this lesson, you will be able to:

  • Describe a random experiment with a sample space and define events inside it
  • State the probability axioms and use them to sanity-check any probability
  • Apply the complement rule to find the chance an event does not happen
  • Apply the addition rule for both mutually exclusive and overlapping events

You only need a little Python and pandas. Let’s begin.


Sample Spaces and Events

Probability starts with a random experiment: any process whose outcome you cannot predict with certainty, like rolling a die or drawing a penguin at random from the dataset. The sample space, written $S$, is the set of all possible outcomes of that experiment. For a single roll of a six-sided die, the sample space is:

$$S = \{1, 2, 3, 4, 5, 6\}$$

An event is any subset of the sample space, that is, a collection of outcomes you care about. “Rolling an even number” is the event $\{2, 4, 6\}$. “Rolling a 5” is the single-outcome event $\{5\}$. An event happens when the experiment produces any outcome inside it.
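The vocabulary maps directly onto Python sets. The sketch below is only an illustration: the names `sample_space`, `even`, `five`, and `occurs` are made up for this example and are not part of the lesson's code.

```python
# Sample space for one roll of a six-sided die, as a Python set.
sample_space = {1, 2, 3, 4, 5, 6}

# Events are subsets of the sample space.
even = {x for x in sample_space if x % 2 == 0}   # "rolling an even number"
five = {5}                                        # "rolling a 5"


def occurs(event, outcome):
    """Return True when the observed outcome lies inside the event."""
    return outcome in event


# Both events really are subsets of the sample space.
print(even <= sample_space, five <= sample_space)   # True True

# Rolling a 4 makes "even" happen; rolling a 5 does not.
print(occurs(even, 4), occurs(even, 5))             # True False
```

Treating events as sets pays off later in the lesson, because “or” becomes set union (`|`) and “and” becomes set intersection (`&`).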

The same vocabulary fits the penguins. Picture the experiment as “pick one of the 344 penguins at random.” The sample space is all 344 birds, and “the penguin is a Gentoo” is an event — the subset of 124 Gentoo penguins. Load the data and list the outcomes for the species experiment:

import pandas as pd

penguins = pd.read_csv("https://datatweets.com/datasets/penguins.csv")
print(penguins["species"].unique())
['Adelie' 'Gentoo' 'Chinstrap']

So if the experiment is “read off the species of a randomly chosen penguin,” the sample space is $S = \{\text{Adelie}, \text{Gentoo}, \text{Chinstrap}\}$, and each species is an event inside it.


The Probability Axioms

Every probability — no matter how you computed it — obeys three rules called the axioms. They are not deep theorems; they are the minimum any sensible measure of chance must satisfy, and they make excellent sanity checks.

  1. Probabilities live between 0 and 1. For any event $A$, $0 \le P(A) \le 1$. A probability of 0 means impossible; 1 means certain. A “probability” of 1.4 or −0.2 is a bug, not a result.
  2. The whole sample space has probability 1. $P(S) = 1$. Something in the sample space must happen.
  3. The probabilities of all distinct outcomes sum to 1. Take a complete set of outcomes, no two of which can occur together, add up their probabilities, and you get exactly 1.

The species experiment shows axioms 1 and 3 directly. Each species probability sits between 0 and 1, and because a penguin is exactly one species, the three add to 1:

# Keep the unrounded proportions; round only for display so the sum is not distorted
probs = penguins["species"].value_counts(normalize=True)
print(probs.round(4))
print("sum:", round(probs.sum(), 4))
species
Adelie       0.4419
Gentoo       0.3605
Chinstrap    0.1977
Name: proportion, dtype: float64
sum: 1.0

Each value is a valid probability, and they total exactly 1.0. That total is your first reflex check: if the probabilities of a complete set of separate outcomes do not add to 1, something is wrong — you have double-counted, missed an outcome, or made an arithmetic slip.
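That reflex check can be wrapped in a small helper. This is a minimal sketch, not part of the lesson's code: the function name `check_distribution` and the tolerance `tol` are hypothetical, and the species proportions are written as counts over 344, where 152 and 68 are inferred from the proportions printed above.

```python
import math


def check_distribution(probs, tol=1e-9):
    """Return a list of axiom violations; an empty list means the checks pass.

    `probs` maps each outcome of a complete, non-overlapping set to its probability.
    """
    problems = []
    # Axiom 1: every probability lies in [0, 1].
    for outcome, p in probs.items():
        if not 0 <= p <= 1:
            problems.append(f"{outcome}: {p} is outside [0, 1]")
    # Axioms 2 and 3: the complete set of separate outcomes sums to 1.
    total = sum(probs.values())
    if not math.isclose(total, 1.0, abs_tol=tol):
        problems.append(f"probabilities sum to {total}, not 1")
    return problems


# Assumed counts: 152 Adelie, 124 Gentoo, 68 Chinstrap out of 344 penguins.
species = {"Adelie": 152 / 344, "Gentoo": 124 / 344, "Chinstrap": 68 / 344}
print(check_distribution(species))   # []

# A missed outcome shows up as a sum that falls short of 1.
missing = {"Adelie": 152 / 344, "Gentoo": 124 / 344}
print(check_distribution(missing))
```

If you feed the helper probabilities that were already rounded to four places, loosen `tol` (for example to `1e-3`), because rounding alone can push the sum slightly away from 1.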

Why the axioms matter

The axioms turn “probability” from a vague feeling into a system you can compute with. The rules in the rest of this lesson — complement and addition — are not new assumptions. They are direct consequences of these three axioms, which is why they always hold.


The Complement Rule

The complement of an event $A$, written $A^c$, is everything in the sample space except $A$: the event “$A$ does not happen.” Because $A$ and $A^c$ together cover the whole sample space and never overlap, their probabilities must add to 1 (axiom 3). Rearranging gives the complement rule:

$$P(A^c) = 1 - P(A)$$

This is one of the most useful shortcuts in probability. Whenever the chance of “not happening” is easier to find than the chance of “happening” — or vice versa — you flip between them by subtracting from 1.

Figure: A rectangle labelled S represents the whole sample space. A shaded blue circle inside it is labelled A, and the surrounding region is labelled A-complement (not A). Above the box is the equation P(A) + P(A-complement) = 1.
The whole sample space splits cleanly into the event A and its complement, everything else, so their probabilities must add to exactly 1.

Take the event “the penguin is Adelie.” Its probability is 0.4419, so the chance a randomly chosen penguin is not Adelie is:

$$P(\text{not Adelie}) = 1 - P(\text{Adelie}) = 1 - 0.4419 = 0.5581$$

Confirm it two ways — once with the rule, once by counting the non-Adelie penguins directly:

p_adelie = (penguins["species"] == "Adelie").mean()
print("P(Adelie):    ", round(p_adelie, 4))
print("rule 1 - P(A):", round(1 - p_adelie, 4))
print("direct count: ", round((penguins["species"] != "Adelie").mean(), 4))
P(Adelie):     0.4419
rule 1 - P(A): 0.5581
direct count:  0.5581

Both routes give 0.5581. The rule and the brute-force count agree, exactly as they must: “not Adelie” is just the Gentoo and Chinstrap penguins combined, and $0.3605 + 0.1977 = 0.5582$, which matches up to rounding in the last digit.
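To see where the last-digit difference comes from, it can help to redo the complement with exact fractions. The sketch below assumes 152 Adelie penguins out of 344; the 152 is not stated in the lesson but is inferred from the reported proportion 0.4419.

```python
from fractions import Fraction

# Assumed counts: 344 penguins in total (from the lesson) and 152 Adelie,
# inferred from the reported proportion (152 / 344 rounds to 0.4419).
TOTAL = 344
ADELIE = 152

p_adelie = Fraction(ADELIE, TOTAL)
p_not_adelie = 1 - p_adelie            # complement rule, exact arithmetic

print(p_not_adelie)                    # 24/43
print(round(float(p_not_adelie), 4))   # 0.5581

# Adding the already-rounded Gentoo and Chinstrap proportions instead
# lands one unit off in the fourth decimal place.
print(round(0.3605 + 0.1977, 4))       # 0.5582
```

The exact answer is 24/43. The value 0.5582 appears only because each term was rounded before adding, which is a reason to round at the very end of a calculation.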


The Addition Rule

The addition rule answers “or” questions: what is the chance that event $A$ happens or event $B$ happens (or both)? In set language this is the union, $A \cup B$. The form of the rule depends on whether the two events can happen at the same time.

Figure: Two Venn diagrams. On the left, separate circles A and B that do not touch, labelled P(A union B) = P(A) + P(B). On the right, circles A and B that overlap, with the shared lens shaded green and labelled A intersection B, and the formula P(A union B) = P(A) + P(B) minus P(A intersection B).
When events are mutually exclusive (left) you just add their probabilities; when they overlap (right) you must subtract the shaded intersection so the shared region is not counted twice.

Mutually exclusive events

Two events are mutually exclusive (or disjoint) when they cannot both happen on the same trial — they share no outcomes. A single penguin cannot be both Adelie and Gentoo, so those two events are mutually exclusive. For such events the addition rule is simple: just add the probabilities.

$$P(A \cup B) = P(A) + P(B)$$

So the chance a random penguin is Adelie or Gentoo is:

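$$P(\text{Adelie} \cup \text{Gentoo}) = P(\text{Adelie}) + P(\text{Gentoo}) \approx 0.4419 + 0.3605 \approx 0.8023$$

(The two rounded terms add to 0.8024; the unrounded proportions sum to 0.8023, the value the data gives.)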

Verify it against the data — add the two probabilities, then check directly with a boolean mask that flags either species:

p_adelie = (penguins["species"] == "Adelie").mean()
p_gentoo = (penguins["species"] == "Gentoo").mean()

either = (penguins["species"] == "Adelie") | (penguins["species"] == "Gentoo")
print("P(A) + P(G):", round(p_adelie + p_gentoo, 4))
print("direct |:  ", round(either.mean(), 4))
P(A) + P(G): 0.8023
direct |:   0.8023

Both give 0.8023. Adding worked here precisely because no penguin was counted twice — there is no penguin that is both Adelie and Gentoo, so nothing got double-counted.
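Before adding, it is worth confirming that the events really are disjoint. The following sketch works from species counts rather than the CSV so it runs offline. The counts are an assumption: 124 Gentoo and 344 total come from the lesson, while 152 and 68 are inferred from the printed proportions. The helper name `prob` is hypothetical.

```python
from fractions import Fraction

# Assumed species counts (they sum to the lesson's 344 penguins).
counts = {"Adelie": 152, "Gentoo": 124, "Chinstrap": 68}
total = sum(counts.values())


def prob(event):
    """P(event) for an event given as a set of species names."""
    return Fraction(sum(counts[s] for s in event), total)


adelie, gentoo = {"Adelie"}, {"Gentoo"}

# Mutually exclusive means the two events share no outcomes.
assert adelie.isdisjoint(gentoo)

# For disjoint events, adding matches the probability of the union exactly.
print(prob(adelie) + prob(gentoo) == prob(adelie | gentoo))   # True
print(round(float(prob(adelie | gentoo)), 4))                 # 0.8023
```

The `isdisjoint` check is the habit to build: plain addition is only safe after it passes.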

When events overlap: the general rule

Now watch what goes wrong when events can happen together. Switch from penguins to a fair six-sided die, where the sample space is $\{1, 2, 3, 4, 5, 6\}$ and every outcome has probability $1/6$. Consider two events:

  • $E$ = “the roll is even” = $\{2, 4, 6\}$, so $P(E) = 3/6 = 0.5$
  • $G$ = “the roll is greater than 3” = $\{4, 5, 6\}$, so $P(G) = 3/6 = 0.5$

These are not mutually exclusive: the outcomes 4 and 6 are both even and greater than 3. If you naively add, you get $0.5 + 0.5 = 1.0$, which claims it is certain the roll is even or above 3. But rolling a 1 or a 3 satisfies neither. The naive sum is wrong because outcomes 4 and 6 were counted twice, once in each event.

The fix is the general addition rule: add the two probabilities, then subtract the probability of the intersection $A \cap B$ (the outcomes the events share) so the overlap is counted only once.

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

Here the overlap is $\{4, 6\}$, so $P(E \cap G) = 2/6$. Enumerate the whole sample space to confirm:

outcomes = [1, 2, 3, 4, 5, 6]
even   = [x for x in outcomes if x % 2 == 0]   # {2, 4, 6}
big    = [x for x in outcomes if x > 3]         # {4, 5, 6}
overlap = [x for x in outcomes if x % 2 == 0 and x > 3]   # {4, 6}

p_even = len(even) / 6
p_big  = len(big) / 6
p_overlap = len(overlap) / 6

print("naive P(E)+P(G):     ", round(p_even + p_big, 4))
print("general rule:        ", round(p_even + p_big - p_overlap, 4))
union = sorted(set(even) | set(big))
print("direct union", union, "->", round(len(union) / 6, 4))
naive P(E)+P(G):      1.0
general rule:         0.6667
direct union [2, 4, 5, 6] -> 0.6667

The naive sum claimed 1.0; the general rule gives 0.6667, which matches the direct count of the union $\{2, 4, 5, 6\}$: four outcomes out of six. Subtracting the overlap fixed the double-count.
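One way to see why subtracting the intersection exactly once is the right correction is to split the union into three non-overlapping pieces. The set-difference notation $A \setminus B$ (outcomes in $A$ but not in $B$) is introduced here only for this sketch and is not used elsewhere in the lesson.

```latex
\begin{align*}
P(A) &= P(A \setminus B) + P(A \cap B) \\
P(B) &= P(B \setminus A) + P(A \cap B) \\
P(A \cup B) &= P(A \setminus B) + P(A \cap B) + P(B \setminus A) \\
            &= P(A) + P(B) - P(A \cap B)
\end{align*}
```

For the die, the three pieces are $E \setminus G = \{2\}$, $E \cap G = \{4, 6\}$, and $G \setminus E = \{5\}$, giving $1/6 + 2/6 + 1/6 = 4/6$, the same 0.6667 as above.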

The general rule is the one to remember, because it always works. Mutually exclusive events are simply the special case where $P(A \cap B) = 0$, so the subtraction term disappears and you are back to plain addition. The penguin species satisfy exactly that: a single penguin cannot be two species, so the intersection is empty and adding alone is correct.
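A single helper can cover both cases. This is a minimal sketch that assumes every outcome in the sample space is equally likely, as on a fair die; the function names `prob` and `prob_union` are hypothetical.

```python
from fractions import Fraction


def prob(event, sample_space):
    """P(event) when every outcome in sample_space is equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


def prob_union(a, b, sample_space):
    """General addition rule: P(A) + P(B) - P(A and B)."""
    return prob(a, sample_space) + prob(b, sample_space) - prob(a & b, sample_space)


die = {1, 2, 3, 4, 5, 6}
even, big, odd = {2, 4, 6}, {4, 5, 6}, {1, 3, 5}

# Overlapping events: the shared outcomes {4, 6} are subtracted once.
print(prob_union(even, big, die))   # 2/3

# Mutually exclusive events: the intersection is empty, so nothing is subtracted.
print(prob_union(even, odd, die))   # 1

# The rule agrees with counting the union directly.
assert prob_union(even, big, die) == prob(even | big, die)
```

Because the subtraction term is zero whenever the events are disjoint, the helper never needs to be told which case it is in.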

Figure: Two Venn-style diagrams. On the left, two separate circles labelled Adelie and Gentoo with no overlap, marked mutually exclusive. On the right, two overlapping circles for even and greater-than-three outcomes sharing the region {4, 6}.
Mutually exclusive events (left) share no outcomes, so you simply add their probabilities. Overlapping events (right) share outcomes (here {4, 6}), so the general addition rule subtracts that shared region to avoid counting it twice.

Practice Exercises

Exercise 1: A complement on the die

A fair six-sided die is rolled. Let $A$ be the event “the roll is greater than 3.” Find $P(A)$, then use the complement rule to find the probability the roll is not greater than 3, and confirm it by counting the outcomes $\{1, 2, 3\}$ directly.

Hint

$P(A) = 3/6 = 0.5$. The complement rule gives $P(A^c) = 1 - 0.5 = 0.5$. Check it with len([x for x in [1,2,3,4,5,6] if x <= 3]) / 6.

Exercise 2: Mutually exclusive penguins

Using the penguins dataset, find the probability that a randomly chosen penguin is Gentoo or Chinstrap. These species are mutually exclusive, so add their probabilities — then verify with a boolean mask that flags either species.

Hint

Compute (penguins["species"] == "Gentoo").mean() and (penguins["species"] == "Chinstrap").mean(), add them, then check against ((penguins["species"] == "Gentoo") | (penguins["species"] == "Chinstrap")).mean(). It should also equal 1 - P(Adelie).

Exercise 3: An overlap on the die

A fair die is rolled. Let $A$ be “the roll is even” and $B$ be “the roll is less than 4.” Use the general addition rule to find $P(A \cup B)$. First decide which outcomes the two events share, then subtract that overlap.

Hint

$A = \{2, 4, 6\}$ and $B = \{1, 2, 3\}$ share only the outcome 2, so $P(A \cap B) = 1/6$. Then $P(A \cup B) = \tfrac{3}{6} + \tfrac{3}{6} - \tfrac{1}{6}$. Confirm by listing the union with sorted(set([2,4,6]) | set([1,2,3])).


Summary

You gave probability a precise vocabulary and the rules that make it usable. A random experiment has a sample space of all possible outcomes, and an event is any subset of it. Every probability obeys the axioms: each one lies between 0 and 1, and the probabilities of a complete set of separate outcomes sum to 1. From those axioms come the two rules you will use most: the complement rule, $P(A^c) = 1 - P(A)$, for “not”; and the addition rule for “or.” For mutually exclusive events you simply add, while the general addition rule $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ subtracts the overlap so shared outcomes are not counted twice.

Key Concepts

  • Sample space ($S$): the set of all possible outcomes of a random experiment.
  • Event: any subset of the sample space; it occurs if any of its outcomes does.
  • Probability axioms: every probability is in $[0, 1]$, and the probabilities of all distinct outcomes sum to 1.
  • Complement rule: $P(A^c) = 1 - P(A)$, the chance an event does not happen.
  • Mutually exclusive: two events that share no outcomes, so $P(A \cap B) = 0$ and probabilities simply add.
  • General addition rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$, subtracting the shared outcomes.

Why This Matters

Almost every probability question you will ever face is built from “not” and “or,” and getting them right hinges on one habit: noticing whether events can happen together. Forgetting to subtract an overlap is a common probability mistake in real-world analysis, from estimating the chance a user clicks either of two buttons to combining failure modes in a reliability model. The addition and complement rules give you a dependable way to combine chances without quietly double-counting.


Next Steps

Continue to Lesson 3 - Solving Complex Probability Problems

Chain the rules together to break multi-step probability questions into pieces you can actually solve.


Continue Building Your Skills

You can now combine events with confidence — flipping to a complement when it is easier, and adding “or” probabilities without double-counting the overlap. Next you will stack these rules together to take apart probability problems that look intimidating at first glance but dissolve once you break them into the simple pieces you already know.