Lesson 4 - Orchestrator-Workers

Welcome to Orchestrator-Workers

Routing, in the last lesson, made one choice: read the request, pick the single specialist who should handle it, and step aside. That’s perfect when a request belongs to one category. But some goals aren’t one job — they’re several, all needed at once. “Plan a 3-day trip” really means research the place, and estimate the budget, and draft the itinerary — three distinct pieces that together make one answer. Routing can’t help here; there’s no single right specialist. This is where orchestrator-workers fits: an orchestrator decomposes the goal into subtasks, runs a worker on each, and a final synthesis step merges all the worker outputs into one coherent result. Where routing runs one agent, orchestrator-workers runs many and combines them.

By the end of this lesson, you will be able to:

  • Explain the three phases of orchestrator-workers: decompose, run workers, synthesize
  • Build an orchestrate function that plans subtasks, runs a worker per subtask, and merges the results
  • Explain why independent workers can run in parallel, and why that’s safe
  • Recognize when a job is sequential plan-then-execute instead, and reach for that pattern

Let’s decompose a goal and put a worker on each piece.


Decompose, Work, Synthesize

The pattern has exactly three phases, and each is built from ordinary model calls:

  1. Plan — one call asks the orchestrator to break the goal into a small number of worker subtasks, one per line. Parse that into a subtasks list.
  2. Work — loop over the subtasks; for each, one worker call does that piece and reports briefly. Collect (subtask, result) pairs.
  3. Synthesize — one final call takes all the worker results and combines them into a single plan.

Figure: An orchestrator decomposes a goal into independent subtasks, a worker (research, budget, itinerary) runs on each, and a synthesis step merges their outputs into one plan. The workers are independent and can run in parallel.

Here is the whole pattern. It’s built on plain client.messages.create calls — no new engine, just the orchestrator deciding what the subtasks are and the synthesis step deciding how they fit together:

def orchestrate(client, goal, *, system, model="claude-haiku-4-5"):
    # Phase 1, plan: one call asks the orchestrator for subtasks, one per line.
    plan = client.messages.create(model=model, max_tokens=256, system=system,
        messages=[{"role":"user","content":f"Break this goal into 2-4 worker subtasks, "
                                            f"one per line:\n{goal}"}])
    # Join the text blocks, strip bullet characters, and drop lines that end up empty.
    plan_text = "".join(b.text for b in plan.content if b.type=="text")
    subtasks = [s for s in (line.strip("-• ").strip() for line in plan_text.splitlines()) if s]
    # Phase 2, work: one worker call per subtask; each sees only its own subtask.
    results = []
    for st in subtasks:
        w = client.messages.create(model=model, max_tokens=256, system=system,
            messages=[{"role":"user","content":f"Worker task: {st}\nDo it and report briefly."}])
        results.append((st, "".join(b.text for b in w.content if b.type=="text")))
    # Phase 3, synthesize: one final call merges every worker result into a single plan.
    synth = client.messages.create(model=model, max_tokens=512, system=system,
        messages=[{"role":"user","content":"Combine these worker results into one plan:\n" +
                   "\n".join(f"- {s}: {r}" for s,r in results)}])
    final = "".join(b.text for b in synth.content if b.type=="text")
    return {"subtasks": subtasks, "results": results, "final": final}

Notice the shape of the return value: subtasks (what the orchestrator decided to split into), results (the raw per-worker outputs, so you can inspect or log them), and final (the merged plan). Keeping all three around means you can see exactly how the answer was assembled — which subtask produced which piece — not just the finished result.
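The parsing step in the plan phase is the most fragile part of orchestrate, because a model may number its lines or use a different bullet. The sketch below is one possible hardening, not part of the lesson's verified code: parse_subtasks and its max_subtasks parameter are hypothetical names introduced here for illustration.

```python
import re

def parse_subtasks(plan_text, max_subtasks=4):
    """Turn the plan call's text into a clean list of subtasks (hypothetical helper)."""
    subtasks = []
    for line in plan_text.splitlines():
        # Drop a leading bullet ("-", "•", "*") or list number ("1.", "2)").
        cleaned = re.sub(r"^\s*(?:[-•*]+|\d+[.)])\s*", "", line).strip()
        if cleaned:  # skip blank lines and lines that were only a bullet
            subtasks.append(cleaned)
    # Cap the list so a chatty plan cannot trigger an unbounded number of worker calls.
    return subtasks[:max_subtasks]

plan_text = "1. Research the destination\n- Estimate the budget\n\n• Draft the itinerary"
print(parse_subtasks(plan_text))
```

A helper like this still treats any preamble sentence the model adds as a subtask, so keeping the plan prompt strict ("one per line") remains the first line of defense.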


Running It: One Goal, Three Workers, One Plan

Give it the goal “Plan a 3-day autumn trip to Japan.” The orchestrator’s plan call returns three subtasks, a worker runs on each, and synthesis merges them:

out = orchestrate(client, "Plan a 3-day autumn trip to Japan.",
                  system="You are Atlas, an orchestrator.")

print(out["subtasks"])
# ['Research the destination', 'Estimate the budget', 'Draft the itinerary']
print(len(out["results"]))   # 3
print(out["final"])
# 3-day Kyoto autumn trip, ~$90/day: temples, Arashiyama, markets.

This run is verified against an SDK-shaped mock — a stand-in that mirrors the real Claude API surface (client.messages.create, response.content, text blocks), so no API key is needed. The check confirms the structure: the plan call yields exactly the three subtasks above, three worker results come back, synthesis produces one final plan, and — the number that pins the whole pattern down — there were exactly 5 model calls: 1 plan + 3 workers + 1 synthesis, with "Arashiyama" surviving from a worker’s output into the final plan.
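To make "SDK-shaped mock" concrete, here is a minimal sketch of what such a stand-in could look like. It is not the mock used to verify the lesson; MockClient, MockMessages, and scripted_reply are hypothetical names, and the scripted replies are invented to match the example output above. It mirrors only the pieces orchestrate touches: client.messages.create, response.content, and text blocks with type and text attributes.

```python
from types import SimpleNamespace

class MockMessages:
    """Hypothetical stand-in for client.messages; records every call."""
    def __init__(self, reply_for):
        self.reply_for = reply_for  # function: user prompt -> reply text
        self.calls = []             # one entry per create() call

    def create(self, *, model, max_tokens, messages, system=None, **kwargs):
        prompt = messages[-1]["content"]
        self.calls.append(prompt)
        # Shape the reply like an SDK response: a list of typed content blocks.
        block = SimpleNamespace(type="text", text=self.reply_for(prompt))
        return SimpleNamespace(content=[block])

class MockClient:
    def __init__(self, reply_for):
        self.messages = MockMessages(reply_for)

def scripted_reply(prompt):
    """Invented replies keyed on the prompt prefixes that orchestrate sends."""
    if prompt.startswith("Break this goal"):
        return "Research the destination\nEstimate the budget\nDraft the itinerary"
    if prompt.startswith("Worker task: Research"):
        return "Kyoto in autumn, mild and scenic; see Arashiyama."
    if prompt.startswith("Worker task:"):
        return "Done."
    return "3-day Kyoto autumn trip, ~$90/day: temples, Arashiyama, markets."

client = MockClient(scripted_reply)
```

Passing this client to orchestrate and then checking len(client.messages.calls) is one way to confirm the 1 plan + 3 workers + 1 synthesis shape without an API key.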

The counts are real; the wording is illustrative

What’s verified is the orchestration: five calls in the 1-plan / 3-workers / 1-synthesis shape, three collected results, and a worker fact (“Arashiyama”) propagating into the synthesized plan. The actual sentences the model writes — “Kyoto in autumn, mild and scenic,” “~$90/day” — are examples; exact wording varies every run. When you point orchestrate at the real Claude API, the plumbing is identical; only the natural-language text changes.


Workers Are Independent — So Run Them in Parallel

Look closely at the worker loop: each worker call depends only on its own subtask. No worker reads another worker’s output. The budget estimate doesn’t wait for the research; the itinerary draft doesn’t wait for the budget. Only synthesis needs all of them — and it waits for the whole set anyway. That independence is the defining property of this pattern, and it has a big consequence: the workers can run in parallel.

The verified code runs them sequentially, which is easier to read and reason about. But because the calls don’t touch each other, you can fan them out concurrently — with asyncio and the async client, or a thread pool — and gather the results:

import asyncio

async def run_worker(client, st, *, system, model="claude-haiku-4-5"):
    w = await client.messages.create(model=model, max_tokens=256, system=system,
        messages=[{"role":"user","content":f"Worker task: {st}\nDo it and report briefly."}])
    return (st, "".join(b.text for b in w.content if b.type=="text"))

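# After the plan call produces `subtasks`. `await` only works inside an async
# function, and `client` must be the SDK's async client (anthropic.AsyncAnthropic).
async def run_workers(client, subtasks, *, system):
    # gather returns results in input order, so they line up with `subtasks`.
    return await asyncio.gather(*(run_worker(client, st, system=system) for st in subtasks))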

This is safe precisely because the workers share no state and no ordering — running them at the same time can’t change any single result. And the payoff is real: instead of wall-clock time being the sum of the workers (research + budget + itinerary), it becomes the time of the slowest single worker. Three roughly equal workers finish in about a third of the sequential time. This is the “sectioning” form of parallelization — split a task into independent sections, run them together, then combine.
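The thread-pool route mentioned above keeps the synchronous client and needs no async code. The following is a minimal sketch under two assumptions: that the client can safely be shared across threads, and that run_workers_threaded and max_workers are hypothetical names chosen for this example rather than part of the lesson's verified code.

```python
from concurrent.futures import ThreadPoolExecutor

def run_workers_threaded(client, subtasks, *, system,
                         model="claude-haiku-4-5", max_workers=4):
    """Run one worker call per subtask on a thread pool (illustrative sketch)."""
    def run_one(st):
        # Same worker prompt as the sequential loop; each call sees only its own subtask.
        w = client.messages.create(model=model, max_tokens=256, system=system,
            messages=[{"role": "user",
                       "content": f"Worker task: {st}\nDo it and report briefly."}])
        return (st, "".join(b.text for b in w.content if b.type == "text"))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, regardless of which call finishes first.
        return list(pool.map(run_one, subtasks))
```

Because the results come back in subtask order, the synthesis call receives the same prompt layout it would get from the sequential loop.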

When NOT to use orchestrator-workers

This pattern shines when the pieces are independent. If the subtasks depend on each other in sequence — step 2 needs step 1’s output, step 3 needs step 2’s — that’s plan-then-execute (Module 5), not independent workers. You can’t parallelize a dependency chain, and a synthesis-of-independent-results won’t hold a sequential plan together. Ask: “could these run at the same time without breaking?” If yes, orchestrator-workers. If no, reach for plan-then-execute.
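For contrast, a dependency chain looks different in code: each call's prompt includes the previous call's output, so no step can start until the one before it finishes. The sketch below only illustrates that data flow. It is not the Module 5 implementation, and run_chain and its prompt wording are hypothetical.

```python
def run_chain(client, steps, *, system, model="claude-haiku-4-5"):
    """Run steps in order, feeding each step the previous step's output (sketch)."""
    previous = ""
    outputs = []
    for step in steps:
        prompt = f"Step: {step}"
        if previous:
            # This line is the dependency: the step cannot be built without `previous`.
            prompt += f"\nResult of the previous step:\n{previous}"
        r = client.messages.create(model=model, max_tokens=256, system=system,
            messages=[{"role": "user", "content": prompt}])
        previous = "".join(b.text for b in r.content if b.type == "text")
        outputs.append((step, previous))
    return outputs
```

In the worker loop of orchestrate, nothing plays the role of previous, which is why those calls can be fanned out while these cannot.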


Practice Exercises

Exercise 1: Count the calls

For the goal “Plan a 3-day autumn trip to Japan,” orchestrate produced three subtasks. Exactly how many model calls did the whole run make, and what was each for?

Hint

Five: 1 plan call (decompose the goal into subtasks) + 3 worker calls (one per subtask) + 1 synthesis call (merge the results into one plan). In general the count is 2 + N for N subtasks — the plan and synthesis are fixed, the workers scale with the decomposition.

Exercise 2: Would parallelizing change the answer?

You switch the worker loop to asyncio.gather so all three workers run at once. Could that change the final plan compared to running them sequentially? Why or why not?

Hint

No. Each worker call depends only on its own subtask and shares no state with the others, so the order they run in, or whether they overlap, can't change what any worker is asked or what synthesis receives (asyncio.gather returns results in input order). Synthesis waits for all of them either way. Wall-clock time drops to that of the slowest worker. The model's wording may still vary from run to run, as with any model call, but not because of the parallelism. That's exactly what makes parallelizing safe here.

Exercise 3: Right pattern for the job?

Task A: “research a city, estimate its budget, and draft an itinerary, then merge into one plan.” Task B: “book a flight, then reserve a hotel for the flight’s dates, then plan activities around the hotel’s location.” Which is orchestrator-workers and which is plan-then-execute?

Hint

Task A is orchestrator-workers — the three pieces are independent (research doesn’t need the budget, the budget doesn’t need the itinerary), so they can run in parallel and be synthesized. Task B is plan-then-execute — each step consumes the previous step’s output (hotel dates depend on the flight, activities depend on the hotel), a dependency chain you can’t parallelize or synthesize away.


Summary

When a goal breaks cleanly into independent pieces, orchestrator-workers decomposes it: a plan call splits the goal into subtasks, a worker call runs on each subtask, and a synthesis call merges the worker outputs into one answer. That’s 2 + N model calls for N subtasks — verified here at exactly 5 (1 plan + 3 workers + 1 synthesis) against an SDK-shaped mock, with a worker fact surviving into the final plan. Because the workers are independent, they can run in parallel (asyncio or threads), turning wall-clock time from the sum of the workers into the slowest single worker — the “sectioning” form of parallelization. Reach for this pattern when the pieces don’t depend on each other; when they form a dependency chain, that’s plan-then-execute instead.

Key Concepts

  • Decompose, work, synthesize — one plan call, one worker call per subtask, one synthesis call to merge.
  • Many, not one — routing picks a single specialist; orchestrator-workers runs many workers and combines them.
  • Independence enables parallelism — no worker reads another’s output, so they can run concurrently and finish in the time of the slowest one.
  • Not for sequential jobs — dependency chains are plan-then-execute (Module 5), not independent workers.

Why This Matters

Orchestrator-workers is how you tackle a goal that’s really several jobs at once — the shape of most “plan me a whole X” requests. Getting the phases right (a clean decomposition, a focused worker per piece, an honest synthesis) turns a sprawling prompt into a set of small, checkable calls you can log and inspect. And spotting that the workers are independent unlocks parallelism for free: the same code, fanned out, answers in a fraction of the wall-clock time. Just as important is knowing the pattern’s edge — when subtasks depend on each other, forcing them through orchestrator-workers produces incoherent plans, and plan-then-execute is the right tool. Next, you’ll bring routing, agents-as-tools, and orchestration together into one small team: a multi-agent Atlas.


Next Steps

Continue to Lesson 5 - Guided Project: A Multi-Agent Atlas

Assemble routing, agents-as-tools, and orchestration into one small coordinated team.



Continue Building Your Skills

You can now decompose a goal into independent subtasks, run a worker on each, and synthesize their outputs into one plan — and you know when to run those workers in parallel and when a sequential dependency chain means you want plan-then-execute instead. Next you’ll combine every pattern in this module into a single multi-agent Atlas that routes, delegates, and orchestrates as the job demands.