Lesson 5 - Guided Project: Atlas's First Tool Loop

Welcome to the Guided Project

Across this module you’ve built every piece of a tool-using agent one at a time: you defined tools, read Claude’s tool_use requests, returned tool_result blocks, and wrapped the whole exchange in a loop that runs until Claude is done. This guided project is where those pieces stop being separate exercises and become one program. By the end you’ll have a single runnable file that gives Atlas — your trip-planning assistant — two real tools and a loop that lets Claude reach for them on its own to answer a question that needs both.

By the end of this project, you will be able to:

  • Write real tool functions and pair each one with a tool definition Claude can read
  • Build a TOOL_FUNCTIONS registry that maps a tool name to the Python function that runs it
  • Wire the agent loop to the real Anthropic client and run it on a multi-part question
  • Read the result the loop returns and trace the control flow that produced it

We’ll build it in five stages, each adding one part of the final file. Let’s start with the tools themselves.


Stage 1: The Real Tool Functions

A tool is two things: a plain Python function that does the work, and a definition that tells Claude the function exists. Stage 1 covers the functions. They are ordinary Python; nothing in them depends on being called by a model:

def get_weather(city):
    table = {"Kyoto": "16°C, clear and crisp", "Lisbon": "19°C, sunny"}
    return table.get(city, "no data for that city")

def convert_currency(amount, from_currency, to_currency):
    rates = {("JPY", "USD"): 0.0067, ("USD", "EUR"): 0.92}
    rate = rates[(from_currency, to_currency)]
    return f"{amount} {from_currency} = {round(amount * rate, 2)} {to_currency}"

TOOL_FUNCTIONS = {"get_weather": get_weather, "convert_currency": convert_currency}

Two things to notice. First, the functions take normal Python arguments (city, amount, from_currency, to_currency) — these are the names Claude will fill in when it calls them, so they have to match the schemas you write next. Second, TOOL_FUNCTIONS is a registry: a dictionary from each tool’s name to the function that runs it. When Claude asks for get_weather, the loop will look up "get_weather" in this dict to find the actual function. That lookup is the bridge between Claude’s request and your code.
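
The registry lookup described above can be tried without Claude at all. The sketch below repeats the Stage 1 functions so it runs on its own; requested_name and requested_input are hypothetical stand-ins for the block.name and block.input values Claude would send, not part of the project file.

```python
def get_weather(city):
    table = {"Kyoto": "16°C, clear and crisp", "Lisbon": "19°C, sunny"}
    return table.get(city, "no data for that city")

def convert_currency(amount, from_currency, to_currency):
    rates = {("JPY", "USD"): 0.0067, ("USD", "EUR"): 0.92}
    rate = rates[(from_currency, to_currency)]
    return f"{amount} {from_currency} = {round(amount * rate, 2)} {to_currency}"

TOOL_FUNCTIONS = {"get_weather": get_weather, "convert_currency": convert_currency}

# Hypothetical stand-ins for what a tool_use block would carry.
requested_name = "convert_currency"
requested_input = {"amount": 20000, "from_currency": "JPY", "to_currency": "USD"}

fn = TOOL_FUNCTIONS[requested_name]        # tool name -> Python function
dispatch_result = fn(**requested_input)    # dict keys become keyword arguments
print(dispatch_result)                     # 20000 JPY = 134.0 USD
```

If a key in requested_input did not match a parameter name, the call would raise a TypeError, which is why the schemas in the next stage have to use the same names.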

These are toy implementations — a lookup table for weather, two hardcoded exchange rates — and that’s deliberate. The point of this project is the loop, not the data source. In Module 3 you’ll replace these bodies with real APIs without changing anything else.


Stage 2: The Tool Definitions and the Registry

Claude can’t see your Python functions; it only sees their definitions. Each definition gives the tool a name (which must match a key in TOOL_FUNCTIONS), a description Claude reads to decide when to use it, and an input_schema describing the arguments:

TOOLS = [
    {"name": "get_weather", "description": "Get the current weather for a city.",
     "input_schema": {"type": "object",
        "properties": {"city": {"type": "string"}}, "required": ["city"]}},
    {"name": "convert_currency", "description": "Convert an amount between currencies.",
     "input_schema": {"type": "object", "properties": {
        "amount": {"type": "number"}, "from_currency": {"type": "string"},
        "to_currency": {"type": "string"}}, "required": ["amount", "from_currency", "to_currency"]}},
]

The names line up exactly with the functions from Stage 1 — get_weather and convert_currency — and the schema property names (city, amount, from_currency, to_currency) line up with each function’s parameters. That alignment is what makes the registry work: Claude returns block.name and block.input, the loop looks up block.name in TOOL_FUNCTIONS, and calls the function with **block.input. The schema keys are the function’s keyword arguments.

So at this point you have two parallel structures that must agree: TOOLS (what Claude sees) and TOOL_FUNCTIONS (what your code runs). Keeping them in sync — one definition and one function per tool, sharing a name — is the small discipline that keeps an agent’s tool surface honest.
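
One way to keep the two structures honest is to check them against each other at startup. The sketch below is an assumption, not part of the project file: check_tool_surface is a hypothetical helper that compares tool names and compares each schema’s property names with the function’s parameters via inspect.signature. It only suits simple functions like these, with no *args or **kwargs, and the function bodies are shortened because only the signatures matter here.

```python
import inspect

def get_weather(city):
    return "16°C, clear and crisp"  # body shortened; only the signature is checked

def convert_currency(amount, from_currency, to_currency):
    return f"{amount} {from_currency} -> {to_currency}"

TOOL_FUNCTIONS = {"get_weather": get_weather, "convert_currency": convert_currency}

TOOLS = [
    {"name": "get_weather", "description": "Get the current weather for a city.",
     "input_schema": {"type": "object",
        "properties": {"city": {"type": "string"}}, "required": ["city"]}},
    {"name": "convert_currency", "description": "Convert an amount between currencies.",
     "input_schema": {"type": "object", "properties": {
        "amount": {"type": "number"}, "from_currency": {"type": "string"},
        "to_currency": {"type": "string"}}, "required": ["amount", "from_currency", "to_currency"]}},
]

def check_tool_surface(tools, tool_functions):
    """Return a list of mismatches; an empty list means the two sides agree."""
    problems = []
    defined = {tool["name"] for tool in tools}
    # Names that appear in TOOLS but not the registry, or the other way round.
    for name in sorted(defined ^ set(tool_functions)):
        problems.append(f"{name}: defined on only one side")
    for tool in tools:
        fn = tool_functions.get(tool["name"])
        if fn is None:
            continue  # already reported above
        schema_keys = set(tool["input_schema"].get("properties", {}))
        params = set(inspect.signature(fn).parameters)
        if schema_keys != params:
            problems.append(
                f"{tool['name']}: schema {sorted(schema_keys)} != parameters {sorted(params)}")
    return problems

print(check_tool_surface(TOOLS, TOOL_FUNCTIONS))  # []
```

Running a check like this before the first API call surfaces a renamed parameter or a forgotten registry entry immediately, instead of as an is_error result in the middle of a conversation.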


Stage 3: The Agent Loop

This is the loop you built in Lesson 3, presented again in full. It’s the engine: it sends the conversation to Claude, and as long as Claude keeps asking for tools, it runs them and feeds the results back. It stops the moment Claude answers in plain text:

def run_agent(client, user_message, *, system, tools, tool_functions,
              model="claude-haiku-4-5", max_steps=8):
    # The transcript starts with the user's question and grows every step.
    messages = [{"role": "user", "content": user_message}]
    for step in range(1, max_steps + 1):
        response = client.messages.create(
            model=model, max_tokens=1024, system=system, tools=tools, messages=messages)
        # Record Claude's reply (text and/or tool_use blocks) in the transcript.
        messages.append({"role": "assistant", "content": response.content})

        # No tool request: treat Claude as finished, join its text blocks, and return.
        if response.stop_reason != "tool_use":
            final = "".join(b.text for b in response.content if b.type == "text")
            return {"answer": final, "steps": step, "messages": messages}

        # Run every requested tool and collect one tool_result per tool_use block.
        tool_results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            fn = tool_functions.get(block.name)  # registry lookup by tool name
            try:
                if fn is None:
                    raise KeyError(f"no such tool: {block.name}")
                result = fn(**block.input)  # schema keys become keyword arguments
                tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                                     "content": str(result)})
            except Exception as exc:
                # Report the failure to Claude instead of crashing the loop.
                tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                                     "content": f"Error: {exc}", "is_error": True})
        # All results go back together as a single user message.
        messages.append({"role": "user", "content": tool_results})
    # The step cap was reached before Claude produced a final answer.
    return {"answer": "Stopped: reached the step limit.", "steps": max_steps, "messages": messages}

Walk through what it does on each step. It calls client.messages.create with the running messages list, appends Claude’s reply to that list, and checks stop_reason. If it’s anything other than "tool_use" (normally "end_turn"), the loop treats Claude as done: it joins the text blocks into a final answer and returns. Otherwise it walks the tool_use blocks, looks each block.name up in tool_functions, runs the function with **block.input, and collects a tool_result for each (echoing block.id so Claude can match the result to its request). A try/except turns any failure, such as an unknown tool or a bad argument, into a tool_result marked is_error, so one broken call can’t crash the whole agent. The results go back as a single user message, and the loop repeats. The max_steps cap guarantees the loop always terminates.

Notice the loop is completely generic. It hardcodes nothing about weather or currency — it takes tools and tool_functions as arguments. That’s why the same loop will carry every agent you build for the rest of the course.
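
Because the loop only touches client.messages.create and a few attributes of the response, its control flow can be exercised offline before spending any API calls. The sketch below is an assumption for illustration: FakeClient, scripted, and the toolu_1 / toolu_2 ids are hypothetical, and the fake mimics only the attributes run_agent reads (stop_reason, content, and each block’s type, id, name, input, and text). It is not the real SDK’s response type. The loop is repeated so the block runs on its own.

```python
from types import SimpleNamespace

def run_agent(client, user_message, *, system, tools, tool_functions,
              model="claude-haiku-4-5", max_steps=8):
    # Same loop as above, repeated so this sketch is self-contained.
    messages = [{"role": "user", "content": user_message}]
    for step in range(1, max_steps + 1):
        response = client.messages.create(
            model=model, max_tokens=1024, system=system, tools=tools, messages=messages)
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            final = "".join(b.text for b in response.content if b.type == "text")
            return {"answer": final, "steps": step, "messages": messages}
        tool_results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            fn = tool_functions.get(block.name)
            try:
                if fn is None:
                    raise KeyError(f"no such tool: {block.name}")
                result = fn(**block.input)
                tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                                     "content": str(result)})
            except Exception as exc:
                tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                                     "content": f"Error: {exc}", "is_error": True})
        messages.append({"role": "user", "content": tool_results})
    return {"answer": "Stopped: reached the step limit.", "steps": max_steps, "messages": messages}


class FakeClient:
    """Hypothetical stand-in for the real client: replays scripted responses."""
    def __init__(self, scripted_responses):
        self._scripted = list(scripted_responses)
        self.messages = self  # lets client.messages.create(...) reach create() below

    def create(self, **kwargs):
        return self._scripted.pop(0)


def scripted():
    """One reply that requests two tools (one of them unknown), then a text answer."""
    ask = SimpleNamespace(stop_reason="tool_use", content=[
        SimpleNamespace(type="tool_use", id="toolu_1", name="get_weather",
                        input={"city": "Kyoto"}),
        SimpleNamespace(type="tool_use", id="toolu_2", name="no_such_tool", input={}),
    ])
    done = SimpleNamespace(stop_reason="end_turn", content=[
        SimpleNamespace(type="text", text="Kyoto is 16°C, clear and crisp.")])
    return [ask, done]


fake_tools = {"get_weather": lambda city: "16°C, clear and crisp"}

offline = run_agent(FakeClient(scripted()), "Weather in Kyoto?",
                    system="test", tools=[], tool_functions=fake_tools)
capped = run_agent(FakeClient(scripted()), "Weather in Kyoto?",
                   system="test", tools=[], tool_functions=fake_tools, max_steps=1)

print(offline["steps"], len(offline["messages"]))   # 2 4
print(offline["messages"][2]["content"][1])         # the is_error result for no_such_tool
print(capped["answer"])                             # Stopped: reached the step limit.
```

The unknown tool produces an is_error result while the valid call still succeeds, and the second run shows the step cap ending the loop before the scripted answer is ever requested.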


Stage 4: Wiring It to the Real Client and Running It

Now connect the loop to the real Anthropic client and run it. The client reads your API key from the environment — never hardcode a key in the file:

from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
SYSTEM = "You are Atlas, a concise trip-planning assistant."

result = run_agent(
    client,
    "What's the weather in Kyoto, and what's 20000 JPY in USD?",
    system=SYSTEM, tools=TOOLS, tool_functions=TOOL_FUNCTIONS,
)
print(result["answer"])
print("steps:", result["steps"])

This is the whole program coming together. Anthropic() constructs the client by reading ANTHROPIC_API_KEY from your environment — set it with export ANTHROPIC_API_KEY=... in your shell before running, so the key never lives in your code or your git history. The question deliberately needs both tools: “What’s the weather in Kyoto” is a get_weather job, and “what’s 20000 JPY in USD” is a convert_currency job. You pass TOOLS and TOOL_FUNCTIONS straight into the loop, and run_agent does the rest — Claude decides which tools to call, the loop runs them, and you get back a dictionary with the final answer and the step count.
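
If the variable is missing, it is easier to catch that up front than to debug a failed request. The sketch below is an assumption, not something the project file requires: require_api_key is a hypothetical helper that checks an environment mapping for ANTHROPIC_API_KEY and raises a readable error when it is absent. It is demonstrated with plain dicts and a placeholder value so it runs without a real key and never prints one.

```python
import os

def require_api_key(env=os.environ, name="ANTHROPIC_API_KEY"):
    """Return the key from the given environment mapping, or raise with a hint."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it in your shell before running.")
    return key

# Demonstration with stand-in environments (the value is a placeholder, not a real key).
print(bool(require_api_key({"ANTHROPIC_API_KEY": "placeholder-key"})))  # True
try:
    require_api_key({})
except RuntimeError as exc:
    print(exc)  # ANTHROPIC_API_KEY is not set; export it in your shell before running.
```

In the project file you would call require_api_key() with no arguments just before constructing Anthropic(), so a missing key stops the program with a clear message.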


Stage 5: Reading the Result

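When you run it, the loop drives the full exchange and returns. Here is the verified trace from running it on that exact request, annotated with the tool calls Claude made (the two print calls in Stage 4 show only the answer and the step count):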

  step 1: Claude calls get_weather({'city': 'Kyoto'})
  step 1: Claude calls convert_currency({'amount': 20000, 'from_currency': 'JPY', 'to_currency': 'USD'})
  final answer: Kyoto is 16°C, clear and crisp, and 20000 JPY is about 134.0 USD.
  steps taken: 2
  messages in transcript: 4

Read the control flow, because that’s what’s guaranteed (the exact wording of the final answer varies from run to run; the shape of the exchange does not). In step 1, Claude looked at the question and asked for both tools at once, get_weather for Kyoto and convert_currency for the JPY-to-USD conversion, in a single response. Your loop ran both functions and fed both results back as one user message. In step 2, Claude had everything it needed, so it answered in plain text and the loop stopped. The transcript ended up four messages long: your question, one assistant message containing both tool requests, one user message containing both tool results, and Claude’s final answer.
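
The four-message shape can be read straight off result["messages"]. The sketch below is an assumption for illustration: summarize_transcript and block_field are hypothetical helpers, and the transcript is hand-built to mirror the shape described above (object-style blocks for Claude’s replies, dicts for your tool results) so the block runs without an API call.

```python
from types import SimpleNamespace

def block_field(block, field):
    """Read a field from either a dict block or an object-style block."""
    return block[field] if isinstance(block, dict) else getattr(block, field)

def summarize_transcript(messages):
    """Return one line per message: its position, role, and block types."""
    lines = []
    for index, message in enumerate(messages, start=1):
        content = message["content"]
        if isinstance(content, str):
            kinds = ["text"]  # a plain string is a single text turn
        else:
            kinds = [block_field(block, "type") for block in content]
        lines.append(f"{index}. {message['role']}: {', '.join(kinds)}")
    return lines

# Hand-built transcript with the same shape as the run described above.
sample_messages = [
    {"role": "user", "content": "What's the weather in Kyoto, and what's 20000 JPY in USD?"},
    {"role": "assistant", "content": [
        SimpleNamespace(type="tool_use", id="toolu_1", name="get_weather",
                        input={"city": "Kyoto"}),
        SimpleNamespace(type="tool_use", id="toolu_2", name="convert_currency",
                        input={"amount": 20000, "from_currency": "JPY", "to_currency": "USD"}),
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_1", "content": "16°C, clear and crisp"},
        {"type": "tool_result", "tool_use_id": "toolu_2", "content": "20000 JPY = 134.0 USD"},
    ]},
    {"role": "assistant", "content": [SimpleNamespace(type="text", text="(final answer)")]},
]

for line in summarize_transcript(sample_messages):
    print(line)
# 1. user: text
# 2. assistant: tool_use, tool_use
# 3. user: tool_result, tool_result
# 4. assistant: text
```

Passing result["messages"] from a real run to the same helper should show whether Claude batched its tool requests into one reply or spread them across several.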

That’s the entire agent loop in two steps: Claude requests, your code runs, Claude answers.

This loop is the skeleton every later module extends

What you just built is the foundation the rest of the course is built on — and the loop itself barely changes. Module 3 makes the tools robust: real APIs behind the functions, schemas Claude uses well, validation with Pydantic, and proper error-and-repair. Module 4 gives Atlas memory across turns. Module 5 adds planning so it can tackle bigger tasks. Each module fleshes out what goes into the loop — better tools, remembered context, a plan — but the request → run → answer cycle in run_agent stays exactly the same. Get this skeleton firmly in mind and everything later slots into it.


Extend Atlas

Exercise 1: Add a third tool

Give Atlas a local_time(city) tool that returns a city’s current time. You’ll need to do it in two places that must agree.

Hint

Write the function and add it to the registry, then add a matching definition to TOOLS:

def local_time(city):
    table = {"Kyoto": "11:20 PM", "Lisbon": "3:20 PM"}
    return table.get(city, "no data for that city")

TOOL_FUNCTIONS["local_time"] = local_time

TOOLS.append({
    "name": "local_time",
    "description": "Get the current local time in a city.",
    "input_schema": {"type": "object",
        "properties": {"city": {"type": "string"}}, "required": ["city"]},
})

The name in the definition (local_time) must match the registry key, and the schema property (city) must match the function’s parameter. Nothing in run_agent changes — the loop already looks tools up by name.

Exercise 2: Ask a question that chains tool calls

Send Atlas a question where one answer feeds the next — for example, converting a price and then comparing it. Watch how many steps the loop takes.

Hint

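Try a prompt like "Convert 20000 JPY to USD, then tell me what that is in EUR." The rates table only has ("JPY", "USD") and ("USD", "EUR"), so Claude has to call convert_currency once to get USD, read the result, then call it again to go USD→EUR. That’s a genuine multi-step run: because Claude needs the first result before it can make the second request, expect three steps instead of two (one per conversion, plus the final answer). If Claude instead tries JPY→EUR directly, the missing rate raises a KeyError, the loop returns it as an is_error result, and Claude can retry through USD at the cost of an extra step.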
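
The chain can be previewed by calling the function by hand. The sketch below repeats convert_currency from Stage 1 so it runs on its own; the variable names are illustrative, and passing 134.0 into the second call imitates Claude reading the amount out of the first result, which is an assumption about how the model will behave.

```python
def convert_currency(amount, from_currency, to_currency):
    rates = {("JPY", "USD"): 0.0067, ("USD", "EUR"): 0.92}
    rate = rates[(from_currency, to_currency)]
    return f"{amount} {from_currency} = {round(amount * rate, 2)} {to_currency}"

# First hop: the only route out of JPY in the rates table.
first_hop = convert_currency(20000, "JPY", "USD")
print(first_hop)    # 20000 JPY = 134.0 USD

# Second hop: uses the amount taken from the first result.
second_hop = convert_currency(134.0, "USD", "EUR")
print(second_hop)   # 134.0 USD = 123.28 EUR

# A direct JPY -> EUR call has no rate, so the function raises KeyError.
try:
    convert_currency(20000, "JPY", "EUR")
    direct_attempt = "succeeded"
except KeyError as exc:
    direct_attempt = f"KeyError: {exc}"
print(direct_attempt)
```

Inside run_agent that KeyError would not crash anything: the try/except would turn it into an is_error tool_result for Claude to react to.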

Exercise 3: Lower max_steps and watch the cap

Call run_agent with max_steps=1 on the two-tool question and look at what comes back.

Hint

result = run_agent(
    client, "What's the weather in Kyoto, and what's 20000 JPY in USD?",
    system=SYSTEM, tools=TOOLS, tool_functions=TOOL_FUNCTIONS, max_steps=1)

With only one step allowed, Claude makes its tool requests in step 1, but the loop never gets a second pass to send the results back and let Claude answer — so it hits the cap and returns {"answer": "Stopped: reached the step limit.", ...}. This is the safety valve in action: max_steps guarantees the loop terminates even if an agent would otherwise keep going. Raise it back to a value like 8 for real runs.


Summary

You assembled Atlas’s first working tool-using agent from the pieces built across this module. The tool functions (get_weather, convert_currency) do the real work in plain Python; the TOOL_FUNCTIONS registry maps each tool’s name to its function; the TOOLS definitions tell Claude which tools exist and when to use them, with names and schemas that line up exactly with the functions. The run_agent loop ties it together — sending the conversation to Claude, running any requested tools by looking them up in the registry, feeding results back, and stopping when Claude answers in plain text. Wired to a real Anthropic() client (with the key loaded from the environment) and run on a two-part question, Claude called both tools in step 1 and answered in step 2 — a four-message transcript. What’s guaranteed is that control flow, not the exact wording of the answer.

Key Concepts

  • Tool functions — ordinary Python functions that do the work; they don’t know they’re called by a model.
  • TOOL_FUNCTIONS registry — a name → function dict the loop uses to find the function for a tool Claude requests.
  • TOOLS definitions — the name / description / input_schema Claude reads; names and schema keys must match the functions and the registry.
  • run_agent loop — the generic engine: request → run tools → return results → repeat until stop_reason isn’t "tool_use", bounded by max_steps.
  • Environment-loaded key — Anthropic() reads ANTHROPIC_API_KEY from the environment; never hardcode a key.

Why This Matters

This is the moment the abstract “model in a loop” becomes a concrete file you can run. Everything you build for the rest of the course extends this exact skeleton: later modules swap in real APIs, add memory, and add planning, but they all plug into the same run_agent cycle of request, run, and answer. Once you can see the whole agent as these five parts — functions, registry, definitions, loop, and a wired-up run — you have a mental model that scales to far more capable agents without getting more complicated at its core.


Next Steps

Continue to Module 3 - Designing Tools

Schemas Claude uses well, validation with Pydantic, error and repair, and parallel tools.


Continue Building Your Skills

You now have a complete, runnable tool-using agent — two real tools, their definitions, a registry, and the loop that drives them. From here the work is about making each part stronger: tools that talk to real services and fail gracefully, definitions Claude reads precisely, memory that carries context between turns, and planning for longer tasks. The loop stays the same; what flows through it gets richer. That’s exactly where Module 3 picks up.