Lesson 3 - The Tool-Use Loop

Welcome to The Tool-Use Loop

In Lesson 1 you walked through a single round-trip by hand: the model asked for one tool, you ran it, you sent the result back, and it answered. That worked because the question needed exactly one tool, exactly once. Real questions are messier. A user might ask for a calculation and the current time, or ask a follow-up that needs a second tool after seeing the first result. One hand-written exchange won’t cover that.

In this lesson you’ll turn that one-shot round-trip into a real loop: a small, reusable function that keeps running tools and feeding results back until the model is finished — no matter how many rounds it takes.

By the end of this lesson, you will be able to:

  • Write a dispatcher that maps a tool name to the Python function that runs it
  • Append the assistant turn and all tool_result blocks correctly each round
  • Loop until stop_reason == "end_turn" instead of stopping after one call
  • Use the SDK tool runner to drive the loop for you when you don’t need full control

You’ll reuse the SDK setup and the tool_use / end_turn stop reasons from Lesson 1. Let’s begin.


Why One Round-Trip Isn’t Enough

The round-trip in Lesson 1 had a fixed shape: send the message, get one tool_use, run it, send the result, get the answer. But the model decides how many tools it needs, and that number isn’t always one.

Compare a few questions. “What is 4827 times 3916?” needs a single calculator call. But “What is 1487 times 932, and what time is it in Tokyo?” needs two tools in the same turn. And some questions chain: the model calls one tool, reads the result, then decides it needs a second tool before it can answer. Your code can’t know in advance which case you’re in.

The fix is to stop thinking in “one round-trip” and start thinking in a loop. Each pass through the loop is one call to the model. If the model asks for tools, you run every tool it requested, send all the results back, and loop again. If it doesn’t, you’re done. The same loop handles one tool, five tools, or several rounds of tools — you never special-case the count.

The condition that controls the loop

You loop while stop_reason == "tool_use" and stop the moment you see "end_turn". Those are the same two stop reasons from Lesson 1 — the only new idea here is repeating the run-and-return step instead of doing it once.
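
The control flow can be sketched without any API call. The snippet below is an illustration only: count_model_calls is a hypothetical helper, and each string in the list stands in for the stop_reason of one real model response. It counts how many model calls happen before the loop sees something other than "tool_use".

```python
def count_model_calls(scripted_stop_reasons):
    """Walk a scripted sequence of stop reasons the way the tool-use loop would.

    Each item stands in for one model response. Returns how many "model calls"
    were made before the loop ended.
    """
    calls = 0
    for stop_reason in scripted_stop_reasons:
        calls += 1                      # one pass through the loop = one model call
        if stop_reason != "tool_use":   # "end_turn" ends the loop
            break
        # In the real loop, this is where you would run the requested tools
        # and append their results before calling the model again.
    return calls


print(count_model_calls(["tool_use", "end_turn"]))              # prints 2
print(count_model_calls(["tool_use", "tool_use", "end_turn"]))  # prints 3
print(count_model_calls(["end_turn"]))                          # prints 1
```

One tool round costs two model calls, two chained rounds cost three, and a question that needs no tools costs one. The loop body never changes.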


The Dispatcher

The model’s request gives you a tool name (a string like "calculator") and an input dictionary. You need a way to turn that name into the actual Python function and run it. Hard-coding if name == "calculator": ... works for one tool but rots fast. A cleaner pattern is a dispatcher: a plain dictionary that maps each tool name to the function that implements it.

Two small real tools

We’ll use two tools you can actually run. A calculator that does exact math (the model is unreliable at big multiplication, so this is a genuine fix), and a get_time that returns the real current time from Python’s datetime:

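from datetime import datetime
from zoneinfo import ZoneInfo


def calculator(expression):
    # Empty builtins hide names like open() and __import__, but this is not a sandbox.
    return str(eval(expression, {"__builtins__": {}}, {}))


def get_time(timezone):
    # timezone is an IANA name such as "Asia/Tokyo".
    return datetime.now(ZoneInfo(timezone)).strftime("%Y-%m-%d %H:%M:%S %Z")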


dispatch = {
    "calculator": calculator,
    "get_time": get_time,
}

# Simulate a tool_use block from the model:
name, tool_input = "calculator", {"expression": "12 * (4 + 3)"}
result = dispatch[name](**tool_input)
print(result)
84

Two things make this work. First, dispatch[name] looks up the function by the model’s string. Second, **tool_input unpacks the input dictionary into keyword arguments, so {"expression": "12 * (4 + 3)"} becomes the call calculator(expression="12 * (4 + 3)"). Because we named each function’s parameters to match the schema’s property names, the model’s input plugs straight in.
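
Both steps can fail: a plain dictionary lookup raises KeyError if the model names a tool you never registered, and ** unpacking raises TypeError if the input keys don’t match the function’s parameters. The sketch below shows one way to guard against both. The names run_tool, demo_dispatch, and shout are hypothetical and exist only for this example; they are not part of the lesson’s code or the SDK.

```python
def shout(text):
    # Stand-in tool for the example; any function with keyword parameters works.
    return text.upper()


demo_dispatch = {"shout": shout}


def run_tool(name, tool_input):
    """Look up a tool by name and call it, returning an error string on failure."""
    func = demo_dispatch.get(name)
    if func is None:
        return f"Error: unknown tool '{name}'"
    try:
        return func(**tool_input)   # keys of tool_input must match parameter names
    except TypeError as exc:
        # Typically raised when the input has missing or unexpected keys.
        return f"Error: bad arguments for '{name}': {exc}"


print(run_tool("shout", {"text": "hello"}))     # prints HELLO
print(run_tool("whisper", {"text": "hello"}))   # unknown tool
print(run_tool("shout", {"txt": "hello"}))      # wrong argument name
```

Returning an error string rather than raising is a design choice in this sketch: it keeps the failure in a form you could send back to the model as a tool result.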

eval is for the lesson, not production

We use eval with an empty builtins namespace to keep the calculator tiny, but it’s still risky on untrusted input, because emptying builtins does not turn eval into a sandbox. In a real app, use a safe expression parser (or a math library) instead of eval. The point here is the loop, not the calculator.
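
One possible shape for a safer calculator is a small evaluator built on Python’s standard ast module. This is a minimal sketch, not a hardened parser: safe_calculator is a hypothetical name, and it deliberately supports only +, -, *, / and parentheses on numeric literals, rejecting everything else.

```python
import ast
import operator

# Map AST operator node types to the functions that implement them.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculator(expression):
    """Evaluate +, -, *, / and parentheses on numeric literals only."""

    def evaluate(node):
        # Plain int or float literals (bool is excluded on purpose).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](evaluate(node.operand))
        # Names, calls, attribute access, and anything else are rejected.
        raise ValueError("unsupported expression")

    tree = ast.parse(expression, mode="eval")
    return str(evaluate(tree.body))


print(safe_calculator("12 * (4 + 3)"))   # prints 84
```

Because the evaluator walks the parsed tree and only handles node types it recognizes, an input such as a function call raises ValueError instead of executing.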


Building the Loop by Hand

Now assemble the full loop. The structure is the round-trip from Lesson 1, wrapped in a while that keeps going as long as the model keeps asking for tools.

One reusable function

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from your environment

tools = [
    {
        "name": "calculator",
        "description": "Evaluate an arithmetic expression and return the exact result.",
        "input_schema": {
            "type": "object",
            "properties": {"expression": {"type": "string"}},
            "required": ["expression"],
        },
    },
    {
        "name": "get_time",
        "description": "Get the current time in a given IANA timezone, e.g. 'Asia/Tokyo'.",
        "input_schema": {
            "type": "object",
            "properties": {"timezone": {"type": "string"}},
            "required": ["timezone"],
        },
    },
]


def run_conversation(question):
    messages = [{"role": "user", "content": question}]
    while True:
        response = client.messages.create(
            model="claude-haiku-4-5",
            max_tokens=400,
            tools=tools,
            messages=messages,
        )
        if response.stop_reason != "tool_use":
            # Normally "end_turn": no more tools requested, so return the text answer.
            return response.content[0].text

        # Append the model's turn, then run every tool it asked for.
        messages.append({"role": "assistant", "content": response.content})

        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                output = dispatch[block.name](**block.input)
                print(f"  [ran {block.name}{block.input} -> {output}]")
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": output,
                })
        messages.append({"role": "user", "content": tool_results})

Read the loop body carefully, because two details are where beginners slip.

1. Append the assistant turn before the results. messages.append({"role": "assistant", "content": response.content}) puts the model’s tool_use request into the history. The result blocks you send next refer back to those requests by id, so the request must already be in the conversation.

2. Collect all tool results into one user turn. When the model asks for two tools at once, it sends two tool_use blocks in a single response. You run both, gather both tool_result blocks into one list, and append them as a single user message. Sending them one at a time, or forgetting one, breaks the pairing — every tool_use must be answered by a matching tool_result in the very next turn.

The print line is just so you can watch the tools fire; it isn’t part of the protocol.
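
To make the pairing rule concrete, here is a sketch that uses plain dictionaries in place of SDK objects. The ids toolu_01 and toolu_02 and the helper unanswered_tool_ids are made up for illustration; the helper just compares the tool_use ids in an assistant turn with the tool_use_id values in the user turn that follows.

```python
def unanswered_tool_ids(assistant_turn, user_turn):
    """Return the tool_use ids in assistant_turn that user_turn does not answer."""
    requested = {
        block["id"]
        for block in assistant_turn["content"]
        if block["type"] == "tool_use"
    }
    answered = {
        block["tool_use_id"]
        for block in user_turn["content"]
        if block["type"] == "tool_result"
    }
    return requested - answered


assistant_turn = {
    "role": "assistant",
    "content": [
        {"type": "tool_use", "id": "toolu_01", "name": "calculator",
         "input": {"expression": "1487 * 932"}},
        {"type": "tool_use", "id": "toolu_02", "name": "get_time",
         "input": {"timezone": "Asia/Tokyo"}},
    ],
}

# Correct: both results travel together in one user turn.
complete_turn = {
    "role": "user",
    "content": [
        {"type": "tool_result", "tool_use_id": "toolu_01", "content": "1385884"},
        {"type": "tool_result", "tool_use_id": "toolu_02",
         "content": "2026-06-28 01:24:28 JST"},
    ],
}

# Incorrect: the get_time result was forgotten.
partial_turn = {"role": "user", "content": complete_turn["content"][:1]}

print(unanswered_tool_ids(assistant_turn, complete_turn))  # prints set()
print(unanswered_tool_ids(assistant_turn, partial_turn))   # prints {'toolu_02'}
```

A non-empty result means some tool_use would go unanswered in the next turn, which is exactly the broken pairing described in point 2.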

A real multi-tool run

Now call it with a question that needs both tools in one turn:

answer = run_conversation(
    "What is 1487 * 932, and what time is it right now in Tokyo?"
)
print(answer)
  [ran calculator{'expression': '1487 * 932'} -> 1385884]
  [ran get_time{'timezone': 'Asia/Tokyo'} -> 2026-06-28 01:24:28 JST]
Here are the answers:

- **1487 × 932 = 1,385,884**
- **Current time in Tokyo: 2026-06-28 01:24:28 JST** (1:24 AM and 28 seconds)

Look at what the loop did. The model returned two tool_use blocks in one response; the loop ran both, sent both results back in a single user turn, and on the next pass the model had everything it needed and produced end_turn. The same run_conversation would handle a question that needed only the calculator, or one that needed a second round of calls — the loop doesn’t care, because it just repeats until the model stops asking.

Always cap the loop in production

A bare while True trusts the model to eventually stop. In production, add a maximum-rounds counter (e.g. break after 10 iterations) so a misbehaving model or a tool that always fails can’t spin forever. You’ll harden the loop further in the next lesson.
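
A capped version might look like the sketch below. To keep it runnable without an API key, the model call is passed in as call_model, a hypothetical callable that returns an object with a stop_reason attribute. The names run_capped, handle_tools, and max_rounds, and the choice to raise RuntimeError, are illustrative assumptions rather than SDK features.

```python
from types import SimpleNamespace


def run_capped(call_model, handle_tools, max_rounds=10):
    """Run the tool-use loop, but give up after max_rounds model calls."""
    for _ in range(max_rounds):
        response = call_model()
        if response.stop_reason != "tool_use":
            return response
        handle_tools(response)   # run tools and append results, as in the manual loop
    raise RuntimeError(f"no final answer after {max_rounds} rounds")


# A fake model that never stops asking for tools.
def stuck_model():
    return SimpleNamespace(stop_reason="tool_use")


try:
    run_capped(stuck_model, handle_tools=lambda response: None, max_rounds=3)
except RuntimeError as exc:
    print(exc)   # prints: no final answer after 3 rounds
```

Replacing while True with a bounded for loop means the function always ends, either with a final response or with an error you can log and handle.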


The SDK Tool Runner: the Easier Path

The manual loop is worth writing once, because it shows you exactly what’s happening. But for simple cases the Anthropic SDK can drive that whole loop for you. This is the tool runner, a beta helper that calls your functions and feeds results back automatically until the model is done.

Tools as decorated functions

Instead of a separate schema and a dispatcher, you write a normal Python function and decorate it with @beta_tool. The SDK reads the function’s name, its docstring, and its type hints to build the schema — so the function is the tool:

import anthropic
from anthropic import beta_tool
from datetime import datetime
from zoneinfo import ZoneInfo

client = anthropic.Anthropic()


@beta_tool
def calculator(expression: str) -> str:
    """Evaluate an arithmetic expression and return the exact result.

    Args:
        expression: A Python arithmetic expression, e.g. "1487 * 932".
    """
    return str(eval(expression, {"__builtins__": {}}, {}))


@beta_tool
def get_time(timezone: str) -> str:
    """Get the current time in a given IANA timezone.

    Args:
        timezone: An IANA timezone name, e.g. "Asia/Tokyo".
    """
    return datetime.now(ZoneInfo(timezone)).strftime("%Y-%m-%d %H:%M:%S %Z")

The docstring becomes the tool description, the Args: lines describe each parameter, and expression: str becomes the input schema. No separate tools list, no dispatch dictionary.
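
To build intuition for how a function can carry its own schema, here is a rough sketch that derives a tool definition from a signature using the standard inspect module. This is not the SDK’s implementation: sketch_schema and the _JSON_TYPES mapping are hypothetical, and the sketch handles only a few simple annotation types and ignores the Args: section.

```python
import inspect

# Minimal mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_schema(func):
    """Build a tool-definition dict from a function's name, docstring, and hints."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)   # no default value means the argument is required
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "input_schema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def get_time(timezone: str) -> str:
    """Get the current time in a given IANA timezone."""
    return "not implemented in this sketch"


print(sketch_schema(get_time))
```

The resulting dictionary has the same shape as the hand-written get_time entry in the tools list from the manual loop, which is why the decorated function can replace it.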

Running until done

Hand the decorated functions to client.beta.messages.tool_runner, then call until_done(). The runner makes the first model call, runs any tools the model requests, sends the results back, and repeats — the same loop you wrote by hand, but inside the SDK:

runner = client.beta.messages.tool_runner(
    model="claude-haiku-4-5",
    max_tokens=400,
    tools=[calculator, get_time],
    messages=[{
        "role": "user",
        "content": "What is 1487 * 932, and what time is it right now in Tokyo?",
    }],
)

final = runner.until_done()
print(final.content[-1].text)
The results are:

- **1487 × 932 = 1,385,884**
- **Current time in Tokyo: 2026-06-28 01:25:58 JST** (Just after 1:25 AM)

Same answer, far less plumbing. The runner found both tool calls, executed your two functions, returned the results, and looped to a final message — and until_done() handed you that final message directly.

Manual loop vs. the runner

Both reach the same end state. Choose based on how much control you need:

  • The manual loop gives you full control. You see every tool_use block, you decide whether to run it, you can log, validate, ask for confirmation, rewrite arguments, or short-circuit before any function executes. When you build agents that gate dangerous actions, you’ll want this.
  • The runner gives you convenience. For simple, trusted tools where you just want an answer, it removes the boilerplate. The trade-off is that it runs your tools automatically, so you give up the inspection point the manual loop hands you each round.

A good rule: prototype with the runner, and drop down to the manual loop the moment you need to inspect, guard, or customize what happens between the model’s request and your function running. The tool runner is a beta feature, so pin your SDK version and expect the API to keep evolving.
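
As an illustration of that inspection point, here is a sketch of a gate you could place in front of the dispatcher in the manual loop. Everything in it is hypothetical: calculator and delete_file are stand-in tools, NEEDS_APPROVAL is an illustrative set of risky tool names, and approve is a callback you would supply yourself (a user prompt, a policy check, and so on).

```python
def calculator(expression):
    # Stand-in tool; the real calculator is not needed to show the gate.
    return "stand-in result"


def delete_file(path):
    # Stand-in for a dangerous action; nothing is actually deleted.
    return f"deleted {path}"


gated_dispatch = {"calculator": calculator, "delete_file": delete_file}
NEEDS_APPROVAL = {"delete_file"}


def gated_run(name, tool_input, approve):
    """Run a tool, calling approve(name, tool_input) first for risky tools."""
    if name in NEEDS_APPROVAL and not approve(name, tool_input):
        # Return text instead of raising, so the refusal can go back as a tool_result.
        return f"Refused: '{name}' was not approved."
    return gated_dispatch[name](**tool_input)


def deny_all(name, tool_input):
    return False


print(gated_run("calculator", {"expression": "2 + 2"}, approve=deny_all))
print(gated_run("delete_file", {"path": "notes.txt"}, approve=deny_all))
```

Safe tools run as before, while the risky one is stopped before its function executes. In the manual loop, you would call a function like gated_run where the loop currently calls the dispatcher directly.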


Practice Exercises

Exercise 1: A calculator-only question

Call run_conversation("What is 4827 times 3916?"). Confirm the loop runs exactly one tool, then returns. How many times does the while loop call the model before it returns?

Hint

The loop calls the model twice: once to get the tool_use request, and once more after you send the tool_result to get the end_turn answer. The [ran calculator ...] line should print exactly once. The exact result is 18902532.

Exercise 2: Add a third tool

Add a get_weather tool to both the tools list and the dispatch dictionary (it can return a fixed string like "18°C and clear"). Then ask a question that needs all three at once, such as “What’s 100 / 8, the time in London, and the weather in Paris?”. Confirm the loop runs all three tools in one round.

Hint

Because the loop iterates over every tool_use block in the response, you don’t change the loop at all — you only register the new tool in two places: the tools schema list and the dispatch map. Watch for three [ran ...] lines before the final answer.

Exercise 3: Port one tool to the runner

Take just the calculator function, decorate it with @beta_tool, and run it through client.beta.messages.tool_runner with the question from Exercise 1. Confirm you get the same numeric answer with no dispatcher and no manual loop.

Hint

Remember the runner reads the schema from your function: give calculator a type hint (expression: str) and a docstring describing what it does. Pass tools=[calculator], then call runner.until_done() and read final.content[-1].text.


Summary

A single round-trip only handles questions that need exactly one tool, once. Wrapping the run-and-return step in a loop lets the model use as many tools, across as many rounds, as a question requires. A dispatcher dictionary maps each tool name to its Python function, and **block.input unpacks the model’s arguments into the call. Each round you append the assistant turn, run every requested tool, and send all the tool_result blocks back as one user turn — looping while stop_reason == "tool_use" and stopping at "end_turn". The SDK tool runner drives that exact loop for you when you don’t need to inspect or guard each call.

Key Concepts

  • The tool-use loop — repeating run-and-return while stop_reason == "tool_use", stopping at "end_turn".
  • Dispatcher — a {name: function} dictionary that turns the model’s tool name into a real call.
  • Argument unpacking: dispatch[name](**block.input) maps the model’s input onto your function’s parameters.
  • One results turn per round — gather all tool_result blocks from one response into a single user message.
  • Tool runner — the SDK’s @beta_tool + tool_runner(...).until_done() helper that runs the loop for you.

Why This Matters

Every agent you build from here on runs this loop at its core: a coding assistant that reads files then edits them, a research bot that searches then summarizes, a data tool that queries then charts. Whether you write the loop by hand for full control or lean on the runner for speed, mastering this cycle is what turns a model that requests actions into a system that completes them.


Next Steps

Continue to Lesson 4 - Parallel Tools, Errors, and Strict Schemas

Run independent tools in parallel, return errors the model can recover from, and tighten schemas so calls stay valid.


Continue Building Your Skills

You now have a loop that handles whatever a model throws at it — one tool or many, one round or several. Next you’ll make that loop robust: running independent tools at the same time for speed, handing errors back so the model can recover instead of crashing, and locking down schemas so the arguments arrive valid every time.