Lesson 5 - Guided Project: A Multi-Agent Atlas
Welcome to the Guided Project
Across this module you took multi-agent systems apart pattern by pattern: a specialist agent can be wrapped as a plain tool its supervisor calls (Lesson 2); a lightweight router can classify a request and dispatch it to exactly one specialist (Lesson 3); and an orchestrator can decompose a goal into subtasks, run a worker on each, and synthesize the results (Lesson 4). Every one of them was built on the same run_agent loop you’ve had since Module 2 — no new engine. Now you’ll put them back together on one agent. Atlas, the trip-planning assistant you’ve been building all course, becomes a team lead: an orchestrator coordinating three specialists, delegating its hardest sub-problem (grounded research) to a retrieval sub-agent, routing simple questions to the right expert, and merging everything into a single, cited plan.
By the end of this project, you will be able to:
- Wrap a retrieval-grounded researcher sub-agent as a research_destination tool the orchestrator can call
- Route an incoming request to the right specialist, with a safe fallback for anything unknown
- Orchestrate a full trip plan: decompose into subtasks, run a worker per subtask, and synthesize
- Combine all three patterns so one planning turn is coordinated and grounded in retrieved, cited facts
We’ll build it in stages, reusing the exact pieces you wrote earlier in the module. Let’s assemble the team.
Stage 1: Meet the Team
Atlas has outgrown being a single generalist. Planning a trip well means researching a destination, estimating a budget, and drafting a day-by-day itinerary — three different jobs, each wanting its own focused prompt and its own tools. Cram them into one agent and you get the bloat you met in Lesson 1: a sprawling prompt, a long tool list, and muddled context. So Atlas becomes an orchestrator coordinating three specialists:
- Researcher: a retrieval-grounded sub-agent. Its only job is to answer factual questions about a place by searching a knowledge base and citing sources. It carries the search_knowledge retrieval tool from Module 6 and nothing else.
- Budget analyst: estimates daily and total costs. Focused prompt, cost-oriented tools, no opinions on temples.
- Itinerary writer: turns research and budget into a concrete day-by-day plan. It writes; it doesn’t research or price.
The division of labor is the whole point: each agent stays sharp because it has one job, a tight prompt, and a small tool set. Atlas’s job is coordination — deciding who handles what and merging the pieces. The rest of this project wires those responsibilities up with the three patterns you already built.
Stage 2: The Researcher as a Tool
Start with the hardest specialist. Facts about a destination — when to visit, what to skip — should come from a knowledge base, not the model’s memory, so they’re grounded and citable. That’s the agents-as-tools pattern from Lesson 2: the researcher is a full sub-agent running its own loop with a retrieval tool, but Atlas sees it as one ordinary tool, research_destination. The tiny keyword KB and search_knowledge come straight from Module 6’s retrieval work.
# Assumed to exist from earlier modules: KB (Module 6), run_agent (Module 2),
# and researcher_client (the client the researcher sub-agent uses).
kb = KB()
kb.add("kyoto-guide", "Arashiyama and the bamboo grove are best visited early morning to avoid crowds.")

# --- the specialist SUB-AGENT: a destination researcher with a retrieval tool ---
def search_knowledge(query):
    hits = kb.search(query, k=1)  # each hit is indexed as (source, text)
    return f"{hits[0][0]}: {hits[0][1]}" if hits else "No match."

researcher_tools = [{"name": "search_knowledge", "description": "Search travel KB.",
                     "input_schema": {"type": "object",
                                      "properties": {"query": {"type": "string"}},
                                      "required": ["query"]}}]

def research_destination(question):
    """Expose the whole researcher agent to the supervisor as ONE tool."""
    out = run_agent(researcher_client, question,
                    system="You are a destination researcher. Search the KB and cite sources.",
                    tools=researcher_tools, tool_functions={"search_knowledge": search_knowledge})
    return out["answer"]

Now Atlas calls research_destination(...) like any tool. Behind it, the researcher runs its own two-step loop: search the KB, then answer with the retrieved fact. The verified run has the supervisor delegate once, the sub-agent run its nested loop, and the grounded fact surface in Atlas’s final answer:

supervisor answer: Plan: go to Arashiyama early morning to beat the crowds [kyoto-guide].

The [kyoto-guide] citation is the tell: that fact came up from the researcher sub-agent’s retrieval, not from Atlas guessing. One agent’s grounded work became another agent’s tool result. (The model text here is an illustrative example; exact wording varies.)
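If you don't have the Module 6 code at hand, the retrieval half of this stage can be tried on its own. The sketch below is an assumption, not the Module 6 implementation: the `KB` class body is hypothetical and only mirrors the `add(doc_id, text)` and `search(query, k)` calls used above, ranking documents by simple word overlap and returning `(doc_id, text)` pairs.

```python
import re


class KB:
    """Hypothetical stand-in for the Module 6 keyword knowledge base."""

    def __init__(self):
        self.docs = []  # list of (doc_id, text) pairs

    def add(self, doc_id, text):
        self.docs.append((doc_id, text))

    @staticmethod
    def _words(text):
        # Lowercased alphanumeric tokens, so punctuation does not block a match.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def search(self, query, k=1):
        """Return up to k (doc_id, text) pairs that share a word with the query."""
        q = self._words(query)
        scored = [(len(q & self._words(text)), doc_id, text) for doc_id, text in self.docs]
        scored = [s for s in scored if s[0] > 0]          # drop documents with no overlap
        scored.sort(key=lambda s: s[0], reverse=True)     # most shared words first
        return [(doc_id, text) for _, doc_id, text in scored[:k]]


kb = KB()
kb.add("kyoto-guide", "Arashiyama and the bamboo grove are best visited early morning to avoid crowds.")


def search_knowledge(query):
    hits = kb.search(query, k=1)
    return f"{hits[0][0]}: {hits[0][1]}" if hits else "No match."


print(search_knowledge("When should I visit Arashiyama?"))
print(search_knowledge("zzz"))
```

The first call prints the Kyoto fact prefixed with its source id, which is what lets the researcher cite `[kyoto-guide]`; the second finds no overlapping word and returns the "No match." string instead of raising.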
Stage 3: Routing the Request
Not every request needs the full team. “Where should I eat vegetarian in Kyoto?” is a food question — send it straight to the food specialist and skip the orchestration. That’s the routing pattern from Lesson 3: one cheap classifier call reads the request, picks a label, and dispatches to exactly one specialist. Crucially, if the classifier returns something you don’t recognize, you fall back to a general specialist instead of crashing.
def route(client, request, specialists, *, model="claude-haiku-4-5"):
    labels = ", ".join(specialists)
    # One cheap classifier call: ask for a single label and nothing else.
    r = client.messages.create(model=model, max_tokens=16, system="You are a router.",
        messages=[{"role": "user", "content":
            f"Classify this request into exactly one of: {labels}. "
            f"Reply with only the label.\n\nRequest: {request}"}])
    label = "".join(b.text for b in r.content if b.type == "text").strip().lower()
    if label not in specialists:
        label = "general"  # unknown label: fall back (specialists must include "general")
    return {"label": label, "answer": specialists[label](request)}

specialists = {
    "weather": lambda q: "[weather] check the forecast",
    "budget": lambda q: "[budget] estimate daily costs",
    "food": lambda q: "[food] recommend restaurants",
    "general": lambda q: "[general] I can help with that",
}
out = route(client, "Where should I eat vegetarian in Kyoto?", specialists)

The verified run classifies the food question correctly and dispatches it. When a second run returns a nonsense label (astrology), the router safely falls back to general:

routed to: food -> [food] recommend restaurants
routed unknown -> general

That fallback is what makes routing production-safe: the classifier is a model call and models occasionally return something off-menu, so you never trust the label blindly. Routing keeps simple requests cheap; the next stage handles the ones that need the whole team.
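To try the fallback without an API key, you can drive the router with a tiny stand-in client. This is a sketch under stated assumptions: `FakeClient` is a hypothetical helper that only mimics the shape of `client.messages.create(...)` (a response whose `content` is a list of blocks with `type` and `text`); it is not the mock used for the verified runs. `route` is repeated so the block runs on its own.

```python
from types import SimpleNamespace


class FakeClient:
    """Hypothetical stand-in: always replies with one fixed text block."""

    def __init__(self, reply):
        self.reply = reply
        self.messages = SimpleNamespace(create=self._create)

    def _create(self, **kwargs):
        block = SimpleNamespace(type="text", text=self.reply)
        return SimpleNamespace(content=[block])


def route(client, request, specialists, *, model="claude-haiku-4-5"):
    labels = ", ".join(specialists)
    r = client.messages.create(model=model, max_tokens=16, system="You are a router.",
        messages=[{"role": "user", "content":
            f"Classify this request into exactly one of: {labels}. "
            f"Reply with only the label.\n\nRequest: {request}"}])
    label = "".join(b.text for b in r.content if b.type == "text").strip().lower()
    if label not in specialists:
        label = "general"  # never trust an off-menu label
    return {"label": label, "answer": specialists[label](request)}


specialists = {
    "weather": lambda q: "[weather] check the forecast",
    "budget": lambda q: "[budget] estimate daily costs",
    "food": lambda q: "[food] recommend restaurants",
    "general": lambda q: "[general] I can help with that",
}

# A well-behaved reply (with stray casing and a newline) and an off-menu one.
known = route(FakeClient("Food\n"), "Where should I eat vegetarian in Kyoto?", specialists)
unknown = route(FakeClient("astrology"), "What's my horoscope?", specialists)
print(known["label"], unknown["label"])  # food general
```

Normalizing with `.strip().lower()` before the membership check is what lets "Food\n" still match, while "astrology" drops through to the general specialist.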
Stage 4: Orchestrating the Full Plan
“Plan a 3-day trip” isn’t one question — it’s three, and they don’t all go to one specialist. That’s the orchestrator-workers pattern from Lesson 4: Atlas decomposes the goal into subtasks, runs a worker on each, then synthesizes one plan from the pieces.
def orchestrate(client, goal, *, system, model="claude-haiku-4-5"):
    # 1) Decompose: one call that returns the subtasks, one per line.
    plan = client.messages.create(model=model, max_tokens=256, system=system,
        messages=[{"role": "user", "content": f"Break this goal into 2-4 worker subtasks, "
                                              f"one per line:\n{goal}"}])
    subtasks = [line.strip("-• ").strip()
                for line in "".join(b.text for b in plan.content if b.type == "text").splitlines()
                if line.strip()]
    # 2) Workers: one call per subtask, run sequentially here.
    results = []
    for st in subtasks:
        w = client.messages.create(model=model, max_tokens=256, system=system,
            messages=[{"role": "user", "content": f"Worker task: {st}\nDo it and report briefly."}])
        results.append((st, "".join(b.text for b in w.content if b.type == "text")))
    # 3) Synthesize: one call that merges every worker result.
    synth = client.messages.create(model=model, max_tokens=512, system=system,
        messages=[{"role": "user", "content": "Combine these worker results into one plan:\n" +
                   "\n".join(f"- {s}: {r}" for s, r in results)}])
    final = "".join(b.text for b in synth.content if b.type == "text")
    return {"subtasks": subtasks, "results": results, "final": final}

out = orchestrate(client, "Plan a 3-day autumn trip to Japan.", system="You are Atlas, an orchestrator.")

The verified run makes exactly five model calls (one to decompose, one per worker, one to synthesize) and produces three subtasks that map cleanly onto our three specialists:

subtasks: ['Research the destination', 'Estimate the budget', 'Draft the itinerary']
final: 3-day Kyoto autumn trip, ~$90/day: temples, Arashiyama, markets.

Notice that the three workers are independent (researching the destination doesn’t depend on the budget estimate), which means they could run in parallel with asyncio to cut latency (Exercise 2). The decompose and synthesize steps bracket them: one call to split the goal, one to merge the results.
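The call count is easy to check yourself with a scripted stand-in client. The sketch below is an assumption about how such a check could look, not the course's verification harness: `ScriptedClient` and `text_of` are hypothetical helpers, and the three worker replies are made-up canned strings. With N subtasks the sequence is 1 + N + 1 calls, so three subtasks give five.

```python
from types import SimpleNamespace


class ScriptedClient:
    """Hypothetical mock: returns canned replies in order and counts calls."""

    def __init__(self, replies):
        self.replies = list(replies)
        self.calls = 0
        self.messages = SimpleNamespace(create=self._create)

    def _create(self, **kwargs):
        text = self.replies[self.calls]
        self.calls += 1
        return SimpleNamespace(content=[SimpleNamespace(type="text", text=text)])


def text_of(response):
    """Join the text blocks of an SDK-shaped response."""
    return "".join(b.text for b in response.content if b.type == "text")


def orchestrate(client, goal, *, system, model="claude-haiku-4-5"):
    plan = client.messages.create(model=model, max_tokens=256, system=system,
        messages=[{"role": "user", "content":
                   f"Break this goal into 2-4 worker subtasks, one per line:\n{goal}"}])
    subtasks = [line.strip("-• ").strip() for line in text_of(plan).splitlines() if line.strip()]
    results = []
    for st in subtasks:
        w = client.messages.create(model=model, max_tokens=256, system=system,
            messages=[{"role": "user", "content": f"Worker task: {st}\nDo it and report briefly."}])
        results.append((st, text_of(w)))
    synth = client.messages.create(model=model, max_tokens=512, system=system,
        messages=[{"role": "user", "content": "Combine these worker results into one plan:\n" +
                   "\n".join(f"- {s}: {r}" for s, r in results)}])
    return {"subtasks": subtasks, "results": results, "final": text_of(synth)}


client = ScriptedClient([
    "- Research the destination\n- Estimate the budget\n- Draft the itinerary",  # decompose
    "Arashiyama early morning [kyoto-guide]",                                     # worker 1 (canned)
    "~$90/day",                                                                   # worker 2 (canned)
    "Day 1 temples, day 2 Arashiyama, day 3 markets",                             # worker 3 (canned)
    "3-day Kyoto autumn trip, ~$90/day: temples, Arashiyama, markets.",           # synthesize
])
out = orchestrate(client, "Plan a 3-day autumn trip to Japan.", system="You are Atlas, an orchestrator.")
print(out["subtasks"])
print(client.calls)  # 5: 1 decompose + 3 workers + 1 synthesize
```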
Stage 5: Put It Together
Now watch one grounded, coordinated planning turn use all three patterns at once. A traveler asks Atlas to plan a 3-day autumn trip to Kyoto:
- Decompose. Atlas the orchestrator breaks the goal into three subtasks: research the destination, estimate the budget, draft the itinerary (Stage 4).
- Delegate research to the sub-agent. The research worker doesn’t answer from memory; it calls research_destination(...), which runs the retrieval sub-agent from Stage 2. That worker’s piece comes back grounded and cited: “visit Arashiyama early morning to avoid crowds [kyoto-guide].” This is the agents-as-tools pattern nested inside an orchestrator worker.
- Route the stray question. When the traveler adds “and where should I eat?”, Atlas routes that one to the food specialist (Stage 3) instead of spinning up the whole team: the right tool for a small job.
- Synthesize. Atlas merges the workers’ results into one plan: “3-day Kyoto autumn trip, ~$90/day: temples, Arashiyama, markets.” The research is cited, the budget is concrete, the itinerary is day-by-day — three specialists, one coherent answer.
That’s the shape of a real multi-agent system: routing, orchestration, and delegation working together, with retrieval keeping the facts honest. And it composes with everything before it — the memory from Module 4 lets Atlas recall the traveler’s preferences before it plans, and the retrieval from Module 6 is exactly what grounds the researcher sub-agent. Atlas is no longer one overloaded generalist; it’s a coordinated team.
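One way to picture the delegation step is a small dispatcher that hands each subtask to the matching specialist. This is a simplified sketch, not the course's implementation: the three specialists are stubs returning canned text, and matching subtasks by keyword is an assumption made for illustration. In the real system the research worker would be an agent that is given research_destination as a tool and decides to call it.

```python
def research_destination(question):
    # Stub for the Stage 2 sub-agent; the real one runs run_agent with retrieval.
    return "Visit Arashiyama early morning to avoid crowds [kyoto-guide]."


def estimate_budget(task):
    # Hypothetical stub for the budget analyst.
    return "About $90/day."


def write_itinerary(task):
    # Hypothetical stub for the itinerary writer.
    return "Day 1 temples, day 2 Arashiyama, day 3 markets."


# Keyword -> specialist. Keyword matching is an illustrative shortcut only.
WORKERS = [
    ("research", research_destination),
    ("budget", estimate_budget),
    ("itinerary", write_itinerary),
]


def run_worker(subtask):
    """Send a subtask to the first specialist whose keyword it mentions."""
    lowered = subtask.lower()
    for keyword, worker in WORKERS:
        if keyword in lowered:
            return worker(subtask)
    return f"[unassigned] {subtask}"  # surface gaps instead of guessing


subtasks = ["Research the destination", "Estimate the budget", "Draft the itinerary"]
results = [(st, run_worker(st)) for st in subtasks]
for st, result in results:
    print(f"- {st}: {result}")
```

Only the research result carries a citation, because only that worker went through the retrieval sub-agent. Returning an explicit `[unassigned]` marker for an unmatched subtask mirrors the router's fallback: the gap stays visible in the synthesis input rather than being silently dropped.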
Real agents combine all three
The three patterns aren’t rivals — production systems layer them. A top-level router decides whether a request is a quick lookup or a full job; a full job goes to an orchestrator that decomposes it; and individual subtasks are handled by specialist sub-agents exposed as tools, some of which (like the researcher) do their own retrieval. But every added agent is another model call, more latency, and another place to fail. Keep the team as small as the problem needs — reach for a second agent only when one is measurably struggling, not because a bigger org chart looks impressive. The best multi-agent system is the smallest one that solves your problem.
Practice Exercises
Exercise 1: Add a fourth specialist
Atlas has a researcher, budget analyst, and itinerary writer. Add a transport planner that figures out how to get between cities (train vs. flight). Give it its own focused prompt and register it so both the router and the orchestrator can reach it.
Hint
Add a "transport" entry to the specialists dict for routing, and make the orchestrator’s decomposition able to emit a “plan transport” subtask. The specialist itself is just another focused agent — a tight system prompt (“You plan intercity travel; compare train and flight by time and cost”) and, if you want it grounded, its own retrieval tool over a transport KB. The pattern is identical to the researcher; only the prompt and tools change.
Exercise 2: Parallelize the workers with asyncio
The orchestrator runs its workers in a for loop — one after another. Because the workers are independent, they can run concurrently. Rewrite the worker loop with asyncio so all three run at once, then synthesize once they’ve all returned.
Hint
Use the async client (AsyncAnthropic), make each worker call an awaited coroutine, and gather them with asyncio.gather(*tasks). The decompose step still runs first (you need the subtasks) and the synthesis step still runs last (it needs all results), but the middle — the workers — fires in parallel. For three independent workers this roughly cuts the worker phase from three sequential calls down to one call’s worth of wall-clock time.
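Before wiring in the async client, it can help to see the gather pattern in isolation. The sketch below uses a hypothetical `fake_worker` coroutine that sleeps instead of calling a model, so it runs without an API key; swapping the sleep for an awaited model call is the part left for the exercise.

```python
import asyncio
import time


async def fake_worker(subtask, delay=0.1):
    """Stand-in for an awaited model call; sleeps instead of hitting an API."""
    await asyncio.sleep(delay)
    return (subtask, f"done: {subtask}")


async def run_workers(subtasks):
    # gather runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(fake_worker(st) for st in subtasks))


subtasks = ["Research the destination", "Estimate the budget", "Draft the itinerary"]

start = time.perf_counter()
results = asyncio.run(run_workers(subtasks))
elapsed = time.perf_counter() - start

for st, result in results:
    print(f"- {st}: {result}")
print(f"worker phase took {elapsed:.2f}s")  # roughly one delay, not three
```

Because `asyncio.gather` preserves input order, the `(subtask, result)` pairs can be passed to the synthesis step exactly as the sequential loop produced them.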
Exercise 3: Add an evaluator agent
Before Atlas returns the synthesized plan, add an evaluator agent that checks it: does it cover all three days, stay within the stated budget, and cite its facts? If the plan fails a check, send it back for a revision.
Hint
The evaluator is one more specialist — a focused agent whose prompt is “Check this plan against these criteria and reply PASS or explain what’s missing.” Run it on out["final"] after synthesis. If it returns anything other than PASS, feed its critique back into a revision call. This “generate, then check” loop is exactly what Module 8 is about — guardrails and evaluation are what separate a demo from something you’d trust in production.
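The control flow of that loop can be sketched without any model calls. Everything here is an assumption for illustration: `evaluate` is a hypothetical rule-based stand-in for the evaluator agent, `revise` is a stub that patches in a missing day, and the `max_rounds` cap is an addition so a plan that never passes cannot loop forever.

```python
def evaluate(plan, days=3):
    """Hypothetical rule-based evaluator; in the exercise this is a model call."""
    missing = [f"Day {d}" for d in range(1, days + 1) if f"Day {d}" not in plan]
    if "[" not in plan:
        missing.append("a citation")
    return "PASS" if not missing else "Missing: " + ", ".join(missing)


def revise(plan, critique):
    """Hypothetical reviser stub; a real one would send the critique to a model."""
    return plan + " Day 3: markets."


def generate_then_check(plan, evaluate, revise, max_rounds=2):
    """Evaluate the plan; on failure, revise with the critique and re-check."""
    for _ in range(max_rounds):
        verdict = evaluate(plan)
        if verdict == "PASS":
            return plan, verdict
        plan = revise(plan, verdict)
    return plan, evaluate(plan)  # final verdict after the last revision


draft = "Day 1: temples. Day 2: Arashiyama [kyoto-guide]."
final, verdict = generate_then_check(draft, evaluate, revise)
print(evaluate(draft))  # Missing: Day 3
print(verdict)          # PASS
print(final)
```

Returning the verdict alongside the plan lets the caller decide what to do when the cap is reached and the plan still fails, for example returning it with a warning.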
Summary
You assembled all three multi-agent patterns on one agent. Atlas became an orchestrator coordinating three focused specialists — a researcher, a budget analyst, and an itinerary writer — instead of one overloaded generalist. You wrapped the researcher as a research_destination tool (Lesson 2), so its retrieval sub-agent runs its own loop and returns grounded, cited facts. You added a router (Lesson 3) that classifies a request and dispatches it to one specialist, with a safe fallback for unknown labels. And you used the orchestrator (Lesson 4) to decompose “plan a 3-day trip” into independent subtasks, run a worker on each, and synthesize one plan. In the final stage they combined: the orchestrator planned the subtasks, the research worker delegated to the retrieval sub-agent so its piece was cited, and synthesis merged everything into “3-day Kyoto autumn trip, ~$90/day: temples, Arashiyama, markets.”
Every orchestration here was verified for real against an SDK-shaped mock (no API key): the agent-as-tool run confirmed the sub-agent’s nested loop and that its retrieved fact propagated up to the supervisor’s answer; the router confirmed correct dispatch and the safe fallback on an unknown label; and the orchestrator confirmed the exact five-call decompose→workers→synthesize sequence. The researcher’s knowledge base uses keyword embedding, not semantic search — the same honest simplification as Module 6; in production you’d swap embed() for sentence-transformers over a vector store like chromadb (see the DataTweets Generative AI course), and none of the coordination code would change. Claude’s calls are mock-verified because there’s no ANTHROPIC_API_KEY in the environment, so the illustrative model text is labeled “example — exact wording varies.”
Key Concepts
- Orchestrator plus specialists: one coordinator agent delegates to focused sub-agents, each with its own tight prompt and small tool set.
- Researcher as a tool: a retrieval-grounded sub-agent exposed as research_destination, so its cited facts flow up to the orchestrator.
- Route then orchestrate: a router sends simple requests to one specialist; full goals go to the orchestrator to decompose.
- Independent workers synthesize: decompose into independent subtasks (parallelizable), run a worker on each, then merge into one plan.
Why This Matters
This is what a real multi-agent system looks like: not a pile of agents, but routing, orchestration, and delegation layered so each piece stays focused and the whole thing stays coordinated. Grounding the researcher with retrieval keeps the facts honest; keeping the team small keeps the coordination cost in check. Atlas now plans a trip the way a good team would — the right specialist on each job, one clear plan out the other end. What’s left is making it trustworthy: catching bad tool calls, retrying failures, capping cost, and evaluating the output. That’s the next module.
Next Steps
Continue to Module 8 - Reliability and Evaluation
Guardrails, retries, cost control, and evaluation — what separates a demo from production.
Continue Building Your Skills
Atlas can now coordinate a team — routing simple questions, orchestrating full plans, and delegating grounded research to a retrieval sub-agent — and merge it all into one cited answer. But a coordinated system is still only a demo until it’s reliable: it needs guardrails so a specialist can’t go off the rails, retries so a flaky call doesn’t sink the plan, cost control so the extra model calls stay affordable, and evaluation so you actually know the plan is good. That’s exactly where the evaluator agent from Exercise 3 was pointing. On to reliability and evaluation.