Lesson 3 - Building the Agent Loop

Welcome to Building the Agent Loop

In Lesson 2 you closed the loop on a single tool call: Claude asked for one tool, you ran it, fed the tool_result back, and Claude gave its final answer. But a real agent rarely needs exactly one tool, exactly once. It might need two tools to answer one question, or a tool whose result prompts another tool call, or no tools at all. You can’t hard-code “call once, then answer” — you need a structure that keeps going as long as Claude keeps asking, and stops the moment it doesn’t.

That structure is the agent loop, and it’s the heart of this whole module. The good news: it’s the exact pattern from Lesson 2, wrapped in a while-style loop with a stop condition. In this lesson you’ll write it once as a reusable run_agent function, walk it line by line, and watch it handle a request that needs two tools before it can answer.

By the end of this lesson, you will be able to:

Write a general run_agent function that drives Claude through any number of tool calls
Explain why the messages list is the agent’s memory across iterations
Use a max_steps guard so the loop can’t run forever
Read a run’s trace: how many model calls happened, and what’s in the transcript

This is where the pieces from Lessons 1 and 2 become a working agent. Let’s build it.

The Loop

Here’s the whole idea in one sentence: call the model, append its turn, and while the stop reason is tool_use, run the tools and feed the results back — until a final answer, bounded by a max-steps guard. That’s the loop. The figure below is the same idea as a flowchart, and then we’ll look at the code.

Figure: The agent loop in code. Start with messages = [user msg]. Then a cycle: call response = messages.create(model, tools, messages); append the assistant's response.content; reach the decision diamond 'stop_reason == tool_use?'. If No, return the final answer (the loop is done). If Yes, run each tool_use block, collect the tool_result blocks, append one user message containing ALL tool_results, and loop back to the model call. A max_steps guard stops a runaway agent that never returns end_turn.

And here is that flowchart as real Python. This is the exact function we ran and verified — read it once top to bottom, then we’ll take it apart piece by piece:

def run_agent(client, user_message, *, system, tools, tool_functions,
              model="claude-haiku-4-5", max_steps=8):
    messages = [{"role": "user", "content": user_message}]  # the transcript is the agent's only memory
    for step in range(1, max_steps + 1):  # one model call per pass, at most max_steps passes
        response = client.messages.create(
            model=model, max_tokens=1024, system=system, tools=tools, messages=messages)
        # Append the assistant turn before processing it, so any tool_result
        # sent later has its matching tool_use already in the transcript.
        messages.append({"role": "assistant", "content": response.content})

        # Any stop reason other than "tool_use" ends the run.
        if response.stop_reason != "tool_use":
            final = "".join(b.text for b in response.content if b.type == "text")
            return {"answer": final, "steps": step, "messages": messages}

        # Run every requested tool; each tool_use block yields exactly one tool_result.
        tool_results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            fn = tool_functions.get(block.name)
            try:
                if fn is None:
                    raise KeyError(f"no such tool: {block.name}")
                result = fn(**block.input)
                tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                                     "content": str(result)})
            except Exception as exc:
                # Report the failure to the model instead of crashing the agent.
                tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                                     "content": f"Error: {exc}", "is_error": True})
        # All results from this turn go back in a single user message.
        messages.append({"role": "user", "content": tool_results})
    # The step limit was reached without a final answer.
    return {"answer": "Stopped: reached the step limit.", "steps": max_steps, "messages": messages}

Notice what the function takes in: a client, the user_message, and then — as keyword-only arguments — the system prompt, the tools definitions (the dictionaries from Lesson 1), and tool_functions, a dictionary mapping each tool’s name to the real Python function that runs it. That last one is the bridge between Claude’s request to call a tool and your code actually running it.
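As a minimal sketch of how tools and tool_functions might line up, the block below defines both for the two tools used later in this lesson. The function bodies, the canned weather string, and the fixed exchange rate are hypothetical stand-ins rather than the course's implementations. The argument names follow the trace shown in Running It, and the name / description / input_schema layout is the Messages API tool-definition shape.

```python
def get_weather(city: str) -> str:
    """Hypothetical stand-in; a real tool would query a weather service."""
    return f"{city}: 16°C, clear"


def convert_currency(amount: float, from_currency: str, to_currency: str) -> str:
    """Hypothetical stand-in using a single made-up, hard-coded rate."""
    rates = {("JPY", "USD"): 0.0067}  # illustrative value, not a real quote
    converted = round(amount * rates[(from_currency, to_currency)], 2)
    return f"{amount} {from_currency} is about {converted} {to_currency}"


# What the model sees: one definition per tool (name, description, JSON Schema).
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
    {
        "name": "convert_currency",
        "description": "Convert an amount from one currency to another.",
        "input_schema": {
            "type": "object",
            "properties": {
                "amount": {"type": "number"},
                "from_currency": {"type": "string"},
                "to_currency": {"type": "string"},
            },
            "required": ["amount", "from_currency", "to_currency"],
        },
    },
]

# What your code runs: the same names, mapped to real callables.
tool_functions = {"get_weather": get_weather, "convert_currency": convert_currency}

# Cheap sanity check before starting a run: every advertised tool is runnable.
missing = [t["name"] for t in tools if t["name"] not in tool_functions]
assert not missing, f"tools without a function: {missing}"
```

Keeping the two structures keyed by the same names is what lets the loop turn block.name into a function call with a single dictionary lookup.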

Walking It Line by Line

The function is short, but every line is load-bearing. Let’s go through it in the order it executes.

The messages list is the memory. The first line builds it:

messages = [{"role": "user", "content": user_message}]

This list is the entire conversation, and it grows with every iteration. Claude is stateless — it remembers nothing between calls — so on each loop pass you send the whole messages list again. That’s why this list, not some hidden state, is the agent’s memory: everything Claude knows about this run lives here, in order.
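To make the growth concrete, here is a small sketch that uses a hypothetical stateless_model function in place of the API. It makes no network call and only records how many messages it is handed on each pass; the placeholder contents are invented for illustration.

```python
def stateless_model(messages):
    """Hypothetical stand-in for the API: its only input is the list it is handed."""
    return {"role": "assistant",
            "content": f"(reply after seeing {len(messages)} messages)"}


messages = [{"role": "user", "content": "first question"}]
sizes_sent = []
for _ in range(3):
    sizes_sent.append(len(messages))            # the whole list goes out on every call
    messages.append(stateless_model(messages))  # the assistant's turn
    messages.append({"role": "user", "content": "(tool results would go here)"})

print(sizes_sent)     # [1, 3, 5]
print(len(messages))  # 7
```

Each pass sends everything accumulated so far, so the model's view of the run is exactly the list and nothing else.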

The for loop bounds the run. Instead of while True, the loop counts:

for step in range(1, max_steps + 1):

Each pass is one model call. If you used while True and the model never stopped requesting tools (because of a bug, a confusing prompt, or a tool whose output always prompts another call), your agent would loop forever, burning tokens and money. max_steps is the seatbelt: the loop makes at most max_steps model calls and then ends no matter what.
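The guard can be seen in isolation with a stripped-down sketch. Here bounded_run, its next_stop_reason callback, and the "finished" key are hypothetical names introduced only for this illustration; the callback stands in for a model call and returns just a stop reason.

```python
def bounded_run(next_stop_reason, max_steps=8):
    """Count passes until the stop reason is not 'tool_use' or the cap is hit."""
    for step in range(1, max_steps + 1):
        if next_stop_reason(step) != "tool_use":
            return {"finished": True, "steps": step}
    # The for loop ran out: the cap ended the run, not the model.
    return {"finished": False, "steps": max_steps}


# A model that never stops asking for tools is cut off after max_steps calls.
runaway = bounded_run(lambda step: "tool_use", max_steps=5)
# A model that finishes on its third call exits early.
healthy = bounded_run(lambda step: "tool_use" if step < 3 else "end_turn")

print(runaway)  # {'finished': False, 'steps': 5}
print(healthy)  # {'finished': True, 'steps': 3}
```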

Call the model, then append its turn. Two lines, always together:

response = client.messages.create(
    model=model, max_tokens=1024, system=system, tools=tools, messages=messages)
messages.append({"role": "assistant", "content": response.content})

You append response.content — the assistant’s full turn, including any tool_use blocks — before you do anything else with it. This is essential: if you later send back a tool_result, the API needs the matching tool_use to already be in the transcript, or it rejects the request. Append first, process second.
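One way to check the pairing rule locally is sketched below. unmatched_tool_results is a hypothetical helper, and it assumes a transcript whose content blocks are plain dicts (the SDK returns objects, so a real transcript would need converting first). It reports every tool_result whose id has no tool_use in the assistant message immediately before it.

```python
def unmatched_tool_results(messages):
    """Return tool_use_ids of tool_results with no tool_use in the preceding assistant turn."""
    problems = []
    for i, msg in enumerate(messages):
        if msg["role"] != "user" or not isinstance(msg["content"], list):
            continue
        offered = set()
        if i > 0 and messages[i - 1]["role"] == "assistant":
            prev_content = messages[i - 1]["content"]
            if isinstance(prev_content, list):
                offered = {b["id"] for b in prev_content if b.get("type") == "tool_use"}
        for block in msg["content"]:
            if block.get("type") == "tool_result" and block["tool_use_id"] not in offered:
                problems.append(block["tool_use_id"])
    return problems


appended_first = [
    {"role": "user", "content": "Weather in Kyoto?"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "t1", "name": "get_weather", "input": {"city": "Kyoto"}}]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "t1", "content": "16°C"}]},
]
# Same run with the assistant turn forgotten: the tool_result is orphaned.
forgot_append = [appended_first[0], appended_first[2]]

print(unmatched_tool_results(appended_first))  # []
print(unmatched_tool_results(forgot_append))   # ['t1']
```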

Check the stop reason: is the agent done? This is the loop’s exit:

if response.stop_reason != "tool_use":
    final = "".join(b.text for b in response.content if b.type == "text")
    return {"answer": final, "steps": step, "messages": messages}

If the stop reason is anything other than "tool_use", there is no tool to run, so the loop is done. Usually the reason is "end_turn": Claude is finished and answered in prose. You stitch together the text blocks into the final answer and return. Note the framing: we don’t check “is it end_turn?”, we check “is it not tool_use?” One caveat: a reason such as "max_tokens" also exits here, and in that case the reply was cut off at the max_tokens limit, so the returned text may be incomplete.
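The exit branch can be exercised without the API, as in this sketch. SimpleNamespace objects stand in for SDK content blocks (only .type and .text are used), and describe_exit is a hypothetical helper that is not part of run_agent. It singles out "max_tokens", one of the API's stop reasons, because a reply that ends that way was truncated.

```python
from types import SimpleNamespace


def final_text(content):
    """Join the text blocks of one assistant turn, skipping any other block type."""
    return "".join(block.text for block in content if block.type == "text")


def describe_exit(stop_reason):
    """Hypothetical helper: label what the loop does for a given stop reason."""
    if stop_reason == "tool_use":
        return "continue: run the tools and loop again"
    if stop_reason == "max_tokens":
        return "exit: reply was cut off, text may be incomplete"
    return "exit: final answer"


# Stand-ins for SDK content blocks.
content = [
    SimpleNamespace(type="text", text="Kyoto is 16°C. "),
    SimpleNamespace(type="text", text="Pack a light jacket."),
]

print(final_text(content))        # Kyoto is 16°C. Pack a light jacket.
print(describe_exit("end_turn"))  # exit: final answer
```

A caller that cares about truncated answers could include response.stop_reason in the returned dictionary; the listing in this lesson does not do so.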

Otherwise, run every requested tool. If we’re past that if, the stop reason was tool_use, so there’s at least one tool to run — possibly several in one turn:

tool_results = []
for block in response.content:
    if block.type != "tool_use":
        continue
    fn = tool_functions.get(block.name)
    try:
        if fn is None:
            raise KeyError(f"no such tool: {block.name}")
        result = fn(**block.input)
        tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                             "content": str(result)})
    except Exception as exc:
        tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                             "content": f"Error: {exc}", "is_error": True})

For each tool_use block, you look up the matching Python function by name, call it with the arguments Claude chose (fn(**block.input)), and collect a tool_result that echoes the same block.id. That id is how the API pairs your result with Claude’s request. The try/except means a tool that fails doesn’t crash your agent — the error comes back as a tool_result so Claude can react to it. (We’ll go deep on that error handling and on multiple tools in one turn — parallel tool use — in the next lesson; for now, just notice that each tool_use block produces exactly one tool_result.)
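The dispatch-and-catch logic can be tried on its own with the sketch below, which pulls the body of the inner loop into a hypothetical run_tool_block helper. The divide tool and the three hand-built blocks are invented for illustration, and SimpleNamespace stands in for the SDK's tool_use block objects.

```python
from types import SimpleNamespace


def run_tool_block(block, tool_functions):
    """Run one tool_use block and return exactly one tool_result dict."""
    fn = tool_functions.get(block.name)
    try:
        if fn is None:
            raise KeyError(f"no such tool: {block.name}")
        result = fn(**block.input)
        return {"type": "tool_result", "tool_use_id": block.id, "content": str(result)}
    except Exception as exc:
        # The failure becomes data for the model instead of a crash.
        return {"type": "tool_result", "tool_use_id": block.id,
                "content": f"Error: {exc}", "is_error": True}


def divide(a, b):
    return a / b


tool_functions = {"divide": divide}
blocks = [
    SimpleNamespace(type="tool_use", id="t1", name="divide", input={"a": 6, "b": 3}),
    SimpleNamespace(type="tool_use", id="t2", name="divide", input={"a": 1, "b": 0}),
    SimpleNamespace(type="tool_use", id="t3", name="lookup", input={}),  # unknown tool
]
results = [run_tool_block(b, tool_functions) for b in blocks]
for r in results:
    print(r["tool_use_id"], r.get("is_error", False), r["content"])
```

All three blocks produce a result with the matching id: a normal value, a caught exception raised by the tool, and a caught lookup failure for a tool name that has no function.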

Feed all results back as one user message. The last line of the loop body:

messages.append({"role": "user", "content": tool_results})

Every tool_result from this turn goes into a single user message. Then the for loop comes around again, calls the model with the now-longer messages list, and Claude continues — armed with the tool outputs you just provided. When the model has everything it needs, it returns a non-tool_use stop reason and the if above returns the answer.

The fallback return. If the loop runs all the way through max_steps without ever hitting a final answer, control falls out of the for loop to the last line, which returns a polite “reached the step limit” message instead of looping forever. In a healthy run you never reach this line.
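A caller may want to tell the two exits apart without comparing answer strings. One possible approach, sketched here with a hypothetical hit_step_limit helper and two hand-built result dictionaries, relies only on the order of appends in run_agent: the normal exit returns right after an assistant message is appended, while the fallback returns right after a user message of tool results.

```python
def hit_step_limit(result):
    """True if a run_agent result came from the fallback return.

    A finished run ends on an assistant message; a run that hit the cap
    ends on the user message carrying the last round of tool results.
    """
    return result["messages"][-1]["role"] == "user"


finished = {"answer": "All done.", "steps": 2, "messages": [
    {"role": "user", "content": "question"},
    {"role": "assistant", "content": "(tool_use blocks)"},
    {"role": "user", "content": "(tool_result blocks)"},
    {"role": "assistant", "content": "All done."},
]}
# What a run with max_steps=1 would leave behind if the model asked for a tool.
stopped = {"answer": "Stopped: reached the step limit.", "steps": 1, "messages": [
    {"role": "user", "content": "question"},
    {"role": "assistant", "content": "(tool_use blocks)"},
    {"role": "user", "content": "(tool_result blocks)"},
]}

print(hit_step_limit(finished))  # False
print(hit_step_limit(stopped))   # True
```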

Running It

Enough reading. Let’s watch it work. Suppose the user asks something that needs two tools: a get_weather tool and a convert_currency tool (the same two you sketched across Lessons 1 and 2). Claude returns both tool_use blocks in a single turn, your loop runs both, feeds both results back, and Claude answers. Here is the real output from running run_agent against that request (the trace lines come from print calls that the listing above leaves out):

  step 1: Claude calls get_weather({'city': 'Kyoto'})
  step 1: Claude calls convert_currency({'amount': 20000, 'from_currency': 'JPY', 'to_currency': 'USD'})
  final answer: Kyoto is 16°C, clear and crisp, and 20000 JPY is about 134.0 USD.
  steps taken: 2
  messages in transcript: 4

Read the trace carefully, because it tells you exactly how the loop behaved:

Step 1 — both tool calls happen here. Claude’s first turn requested two tools at once, so your code ran both get_weather and convert_currency and collected two tool_results into one user message.
The final answer arrives on step 2. With both tool results now in messages, Claude had everything it needed and returned a prose answer — a non-tool_use stop reason — so the loop returned.
steps taken: 2 means the loop made 2 model calls: one to get the tool requests, one to get the final answer. Running two tools did not cost two extra model calls — they came back together in step 1’s single turn.
messages in transcript: 4 is the conversation that accumulated: user (the question) → assistant (two tool_use blocks) → user (two tool_result blocks) → assistant (the final text). Four messages, exactly matching the loop’s appends.

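That mapping of 2 model calls to 4 messages follows from the control flow once Claude requests both tools in a single turn: every model call appends one assistant message, and every round of tools appends one user message. Had Claude asked for the tools one at a time, the same loop would have made 3 model calls and built a 6-message transcript. The exact wording of “final answer” is illustrative; phrasing varies from run to run because it’s generated text. What never varies is the shape: append the turn, check the stop reason, run tools if asked, feed results back, repeat.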
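The shape of that run can be reproduced offline with a scripted stand-in for the client, as in this sketch. ScriptedMessages, the tool_use helper, the canned responses, the user question, and the lambda tools are all hypothetical; run_agent is repeated from the listing in The Loop only so the block runs on its own. Nothing here calls the real API, so it checks the control flow rather than anything the model would say.

```python
from types import SimpleNamespace


def run_agent(client, user_message, *, system, tools, tool_functions,
              model="claude-haiku-4-5", max_steps=8):
    messages = [{"role": "user", "content": user_message}]
    for step in range(1, max_steps + 1):
        response = client.messages.create(
            model=model, max_tokens=1024, system=system, tools=tools, messages=messages)
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            final = "".join(b.text for b in response.content if b.type == "text")
            return {"answer": final, "steps": step, "messages": messages}
        tool_results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            fn = tool_functions.get(block.name)
            try:
                if fn is None:
                    raise KeyError(f"no such tool: {block.name}")
                result = fn(**block.input)
                tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                                     "content": str(result)})
            except Exception as exc:
                tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                                     "content": f"Error: {exc}", "is_error": True})
        messages.append({"role": "user", "content": tool_results})
    return {"answer": "Stopped: reached the step limit.", "steps": max_steps, "messages": messages}


class ScriptedMessages:
    """Hypothetical stand-in for client.messages: replays canned responses in order."""

    def __init__(self, responses):
        self._responses = list(responses)
        self.calls = 0

    def create(self, **kwargs):
        self.calls += 1
        return self._responses.pop(0)


def tool_use(block_id, name, tool_input):
    return SimpleNamespace(type="tool_use", id=block_id, name=name, input=tool_input)


script = [
    # Turn 1: two tool requests in a single assistant turn.
    SimpleNamespace(stop_reason="tool_use", content=[
        tool_use("t1", "get_weather", {"city": "Kyoto"}),
        tool_use("t2", "convert_currency",
                 {"amount": 20000, "from_currency": "JPY", "to_currency": "USD"}),
    ]),
    # Turn 2: a prose answer, which ends the loop.
    SimpleNamespace(stop_reason="end_turn", content=[
        SimpleNamespace(type="text", text="(final answer text)"),
    ]),
]
client = SimpleNamespace(messages=ScriptedMessages(script))
result = run_agent(
    client, "What's the weather in Kyoto, and what is 20000 JPY in USD?",
    system="", tools=[],
    tool_functions={
        "get_weather": lambda city: f"(weather for {city})",
        "convert_currency": lambda amount, from_currency, to_currency: "(converted amount)",
    },
)

print(result["steps"], len(result["messages"]))  # 2 4
print([m["role"] for m in result["messages"]])   # ['user', 'assistant', 'user', 'assistant']
```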

The messages list is the whole memory of the run

There is no hidden agent state anywhere — the messages list is the memory. Every iteration appends to it (the assistant’s turn, then your tool results), and every model call sends the full list back, because Claude itself is stateless between calls. This is why the transcript grew to four messages for a two-tool answer, and it’s why everything you’ll later hear about “agent memory” — trimming old turns, summarizing, persisting across sessions — is really about managing this one list. Master the list and you’ve mastered the loop.
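The size of the list can be worked out from the loop's appends alone. A run that finishes normally after n model calls holds 1 opening user message, n assistant turns, and n - 1 tool-result messages, for 2n in total. A run that hits the step limit holds one more tool-result message. The transcript_length helper below is a hypothetical name for that arithmetic and assumes at least one model call.

```python
def transcript_length(model_calls, hit_step_limit=False):
    """Messages in the transcript after a run with the given number of model calls."""
    assistant_turns = model_calls
    # Every model call except a final-answer call is followed by tool results.
    tool_result_turns = model_calls if hit_step_limit else model_calls - 1
    return 1 + assistant_turns + tool_result_turns


print(transcript_length(1))                       # 2: question, answer (no tools)
print(transcript_length(2))                       # 4: the two-tool run in this lesson
print(transcript_length(8, hit_step_limit=True))  # 17: default max_steps, never finished
```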

Practice Exercises

Exercise 1: Why a loop, not a single call?

Lesson 2 handled one tool call by calling the model, running the tool, and calling the model once more. Why can’t every agent just do that — call, run, call — instead of looping?

Hint

Because you don’t know in advance how many tool calls a request needs. One question might need zero tools, another two, and another a tool whose result prompts a follow-up tool call. A fixed “call, run, call” only handles exactly one round of tools; the loop handles any number — including zero — because it keeps going as long as the stop reason is tool_use and stops the instant it isn’t.

Exercise 2: What makes the loop stop?

There are two distinct ways run_agent can return. What are they, and which one is the “normal, healthy” exit?

Hint

The healthy exit is the if response.stop_reason != "tool_use": branch — Claude returned a final answer (any non-tool_use stop reason, usually "end_turn"), so the loop returns the text. The other exit is the fallback return after the for loop: the agent hit max_steps model calls without ever finishing, so it returns “reached the step limit.” The first is success; the second is the safety net that stops a runaway agent.

Exercise 3: How many model calls for a two-tool request?

In the verified run, the question needed two tools and the trace showed steps taken: 2. Why is it 2 model calls and not 3 (one per tool plus a final answer)?

Hint

Because Claude requested both tools in a single turn — two tool_use blocks in one response. Your loop ran both and returned both tool_results in one user message, so it only took one model call to request the tools and one more to answer. The number of model calls is the number of times around the loop, not the number of tools — running extra tools in the same turn is free in terms of model calls.
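The difference between model calls and tool calls can be read straight off a transcript. In this sketch, count_calls is a hypothetical helper that assumes content blocks are plain dicts, and the two hand-built transcripts contrast the parallel case from this lesson with a made-up sequential variant of the same request.

```python
def count_calls(messages):
    """Return (model_calls, tool_calls) for a transcript whose blocks are plain dicts."""
    assistant_turns = [m for m in messages if m["role"] == "assistant"]
    tool_calls = sum(
        1
        for turn in assistant_turns
        if isinstance(turn["content"], list)
        for block in turn["content"]
        if block.get("type") == "tool_use"
    )
    return len(assistant_turns), tool_calls


def use(block_id):
    return {"type": "tool_use", "id": block_id, "name": "(tool)", "input": {}}


def res(block_id):
    return {"type": "tool_result", "tool_use_id": block_id, "content": "(output)"}


answer = {"role": "assistant", "content": [{"type": "text", "text": "(answer)"}]}
question = {"role": "user", "content": "(question)"}

# Both tools requested in one turn, as in the verified run.
parallel = [question,
            {"role": "assistant", "content": [use("t1"), use("t2")]},
            {"role": "user", "content": [res("t1"), res("t2")]},
            answer]
# The same two tools requested one turn at a time.
sequential = [question,
              {"role": "assistant", "content": [use("t1")]},
              {"role": "user", "content": [res("t1")]},
              {"role": "assistant", "content": [use("t2")]},
              {"role": "user", "content": [res("t2")]},
              answer]

print(count_calls(parallel))    # (2, 2)
print(count_calls(sequential))  # (3, 2)
```

Both transcripts contain two tool calls, but only the sequential one pays for a third model call.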

Summary

The agent loop generalizes a single tool call into a function that drives Claude through any number of tool calls. You start with messages = [the user message], then loop: call messages.create, append the assistant’s turn (response.content), and check the stop reason. If it’s not "tool_use", you extract the text and return, because the agent is done. If it is "tool_use", you run every requested tool, collect one tool_result per tool_use block (each echoing that block’s id) into a single user message, append it, and loop again. A max_steps guard caps the number of model calls so a misbehaving agent can’t run forever. The running messages list is the agent’s memory: it grows each pass and is sent in full on every call, because Claude is stateless. In the verified run, a two-tool request took 2 model calls and produced a 4-message transcript.

Key Concepts

run_agent loop — call model → append turn → if not tool_use, return; else run tools, append results, repeat.
messages as memory — the one list holds the whole run; sent in full on every call.
max_steps guard — bounds the loop so it can’t run forever.
Stop-reason check — exit on any non-tool_use stop reason; continue only on tool_use.
Model calls vs. tools — steps = times around the loop, not number of tools (parallel tools share a turn).

Why This Matters

This loop is the agent. Everything else in agent engineering — better tools, memory management, guardrails, multi-step planning — is built on top of this exact control flow. Once you can see how the messages list carries the run and how the stop reason gates the loop, the rest of the field stops feeling like magic: an “agent framework” is mostly this loop plus conveniences. And the max_steps guard is your first taste of a recurring theme — agents need bounds, because a model in a loop will happily keep going. Next you’ll make the loop sturdier: handling tool errors gracefully and running multiple tools in parallel, which is where production agents earn their reliability.

Next Steps

Continue to Lesson 4 - Errors, Parallel Tools, and Stopping

Make the loop production-ready: handle tool failures as results, run multiple tool calls in one turn, and reason about when to stop.

Back to Module Overview

Return to The Agent Loop with Claude module overview

Continue Building Your Skills

You now have a working agent loop — a single function that drives Claude through any number of tool calls until it produces a final answer, bounded by a step limit. Next you’ll harden it: turning tool failures into tool_results the model can recover from, handling several tool calls in a single turn, and thinking carefully about the conditions that should make an agent stop.
