Lesson 3 - The Randomization Unit

Welcome to The Randomization Unit

Lumen is ready to split traffic: half of visitors see the new signup page, half see the old one. But “half of what?” is a real question, and the answer isn’t obvious. Half the users? Half the visits? Half the study groups? These sound interchangeable and they are not — each one flips the coin on a different thing, and the wrong choice can quietly break the experiment before a single number comes in. The thing you flip the coin on is called the randomization unit, and choosing it is a design decision every bit as important as the hypothesis. This lesson is about choosing it well.

By the end of this lesson, you will be able to:

  • Explain what a randomization unit is and why the choice matters
  • Compare randomizing by user, by session, and by cluster
  • Name the two properties a good randomization unit must satisfy
  • Assign users deterministically so each one gets a consistent experience across visits

Let’s start with the three units you’ll actually choose between.


Three Ways to Flip the Coin

The randomization unit is simply what you assign to treatment or control. Three units cover almost every experiment you’ll run:

  • By user (the default). Each user is assigned once, and every visit they make lands in the same group. A user who sees the new signup page today sees it again next week. This is what you want for almost any product change: it gives each person one coherent experience, and their repeat visits don’t get scattered across both groups.
  • By session. Each visit is assigned independently — the same person can be in treatment this morning and control tonight. It’s occasionally simpler to implement if your system doesn’t have stable user IDs, but it means one user can see both versions, and their visits are correlated with each other rather than independent. Reach for it only when a within-visit change genuinely can’t span sessions.
  • By cluster. Whole groups are assigned together — a classroom, a city, a company, a team, a social graph. You use this when the people in a group interact or when the treatment can spill over from one person to another. If Lumen tests a feature that students use together in a study group, assigning individuals would let the treatment leak from a treated student to a control classmate, contaminating the comparison. Assigning the whole group keeps treatment and control cleanly separated.
Three options for the randomization unit. By user (the default): a user is assigned once and gets a consistent experience across all their sessions. By session: each visit is assigned independently, so the same user may see both versions and risk an inconsistent experience with correlated observations. By cluster: whole groups such as classrooms, cities, or teams are assigned together to avoid interference and spillover between interacting users. The unit must give each subject a consistent experience and keep the two groups independent of each other.
What you flip the coin on decides the experiment's shape: by user for a consistent per-person experience, by session only when a within-visit change can't span visits, by cluster when users interact and treatment could spill over.

The Two Properties a Good Unit Must Satisfy

Underneath those three choices are two properties the unit has to satisfy, and every failure mode is one of them breaking:

  • Consistency — each subject gets one coherent experience. A user who sees the new page, then the old page, then the new page again isn’t experiencing “the treatment”; they’re experiencing a flickering mess, and whatever they do can’t be attributed to either version. Randomizing by user gives consistency for free — the same person always lands in the same group. Randomizing by session throws it away, which is exactly why by-session is risky for anything a user will encounter more than once.
  • Independence — the two groups don’t influence each other. The whole comparison assumes that what happens in treatment says nothing about what happens in control except through the change itself. When users interact — share, invite, collaborate, compete — treatment can leak into control (called interference or spillover), and the groups stop being independent. A UI tweak on Lumen’s signup page has no spillover, so by-user is fine. But a feature where students collaborate in study groups does: assign individuals and a treated student pulls their control groupmates along. Randomizing by the cluster — the whole study group — restores independence.

So the rule of thumb: by user unless something breaks one of these. If users interact and treatment can spill, move up to by cluster. If a change genuinely can’t persist across visits, only then consider by session — and know what consistency you’re giving up.


Making Assignment Consistent in Code

Here’s the mechanism that makes “by user” actually consistent. Instead of re-flipping a coin on every visit, you hash the user ID and derive the group from the hash. The same ID always hashes to the same value, so the same user always lands in the same group — no need to store the assignment anywhere:

import hashlib

def assign(user_id, salt="signup_test"):
    h = hashlib.md5(f"{salt}:{user_id}".encode()).hexdigest()
    return "treatment" if int(h, 16) % 2 == 0 else "control"

for uid in ["u-1001", "u-1002", "u-1003"]:
    print(uid, assign(uid), assign(uid))   # same result both times -> consistent

Each ID prints the same group on both calls — that’s the point. Because the hash of a given ID never changes, a user’s assignment is deterministic: stable across every visit, so their experience stays consistent. Yet across different users the hash spreads out like a fair coin, so the split is still effectively random. The salt matters too: change it and the same users get reshuffled independently, which lets Lumen run several experiments at once without one experiment’s split leaking into another’s.

Deterministic isn’t the same as biased

“The same user always gets the same group” can sound like it defeats the randomness — it doesn’t. Randomization applies across subjects, not within one: over thousands of user IDs the hash sends roughly half to each group with no pattern tied to anything about the user. Determinism only fixes each individual’s group so their experience is consistent from visit to visit. Storing assignments in a database would achieve the same consistency, but hashing gives it to you with no lookup and no state — the same ID recomputes the same group anywhere, anytime.


Why the Unit Drives Cost

One more consequence, because it sets up the next lesson. Statistics cares about the number of independent units, not the number of people. When you randomize by user, each user is an independent unit. When you randomize by cluster, the whole cluster is one unit — the students inside a study group are correlated, so a 30-person group counts as far less than 30 independent observations. That means a clustered experiment needs more total users to reach the same statistical power as a user-level one. Choosing a coarser unit buys you independence in the presence of spillover, but you pay for it in sample size — which is exactly the trade-off the next lesson starts to quantify.


Practice Exercises

Exercise 1: Pick the unit

Lumen wants to test a new color for the “Sign up” button. Users never interact through this button, and the change should look the same every time a person sees it. What randomization unit should you use, and why?

Hint

By user. There’s no interaction between users, so no spillover — independence is already satisfied. The remaining concern is consistency: a person shouldn’t see a blue button one visit and a green one the next, which is exactly what by-session would risk. Assigning by user gives each person one stable color across every visit, which is what you want for a UI change.

Exercise 2: Spot the spillover

Lumen launches a referral feature: treated users can invite friends to join a shared study group. The team plans to randomize by individual user. What goes wrong?

Hint

Interference. A treated user invites control users into their group, so the treatment’s effect leaks into the control group — the two groups are no longer independent, and any measured difference is muddied by the leakage. The fix is to randomize by cluster: assign whole study groups (or whole social circles) to treatment or control together, so the invitation stays inside one group and the comparison stays clean.

Exercise 3: Why hash the ID?

A teammate proposes assigning each user by calling random.random() fresh on every page load: below 0.5 is control, above is treatment. What breaks, and how does hashing the user ID fix it?

Hint

Calling random.random() each load re-flips the coin every visit, so the same user bounces between treatment and control — consistency is destroyed and their behavior can’t be attributed to either version. Hashing the user ID makes the assignment deterministic: the same ID always hashes to the same value, so the user lands in the same group on every visit with no stored state. It’s still random across users, but fixed for each one — consistency and randomness at the same time.


Summary

The randomization unit is what you flip the coin on, and choosing it is a design decision as consequential as the hypothesis. There are three common choices: by user (the default — each person gets one consistent experience across all visits), by session (each visit assigned independently, which lets one user see both versions and correlates their observations, so use it sparingly), and by cluster (whole groups assigned together, for when users interact and treatment can spill over). Underneath, the unit must satisfy two properties: consistency (each subject gets one coherent experience) and independence (the two groups don’t influence each other). A deterministic hash of the user ID gives each user a stable group across every visit while staying random across users — and choosing a coarser unit like a cluster buys independence at the cost of needing more total users for the same power.

Key Concepts

  • Randomization unit — the thing assigned to treatment or control: user, session, or cluster.
  • Consistency — each subject gets one coherent experience; by-user gives it, by-session breaks it.
  • Independence / spillover — groups mustn’t influence each other; cluster randomization prevents interference when users interact.
  • Deterministic assignment — hashing the user ID fixes each person’s group across visits while staying random across users.

Why This Matters

The randomization unit is one of the quietest ways an experiment goes wrong: pick by-session for a change users see repeatedly and you get a flickering experience nobody’s behavior can be attributed to; randomize individuals when they interact and treatment leaks into control until the comparison means nothing. Both failures are invisible in the results — the numbers still come out, they’re just uninterpretable. Getting the unit right up front, and making the assignment deterministic so it actually stays consistent, is what keeps the two groups comparable. Next, you’ll put a number on the threshold you named in your hypothesis: the minimum detectable effect, the smallest change worth caring about — and the first driver of how much data you’ll need.


Next Steps

Continue to Lesson 4 - The Minimum Detectable Effect

Put a number on the smallest change worth detecting, and see how it drives the sample size your experiment needs.

Back to Module Overview

Return to the Designing an Experiment module overview


Continue Building Your Skills

You can now choose what to flip the coin on — by user for a consistent per-person experience, by cluster when users interact and treatment could spill over, by session only when a change can’t span visits — and make each assignment deterministic so a user stays in the same group every visit. Next you’ll turn the threshold from your hypothesis into a concrete minimum detectable effect: the smallest change worth detecting, which is the first thing that decides how much data your experiment needs.