Lesson 3 - Building the Agent Loop
Welcome to Building the Agent Loop
In Lesson 2 you closed the loop on a single tool call: Claude asked for one tool, you ran it, fed the tool_result back, and Claude gave its final answer. But a real agent rarely needs exactly one tool, exactly once. It might need two tools to answer one question, or a tool whose result prompts another tool call, or no tools at all. You can’t hard-code “call once, then answer” — you need a structure that keeps going as long as Claude keeps asking, and stops the moment it doesn’t.
That structure is the agent loop, and it’s the heart of this whole module. The good news: it’s the exact pattern from Lesson 2, wrapped in a while-style loop with a stop condition. In this lesson you’ll write it once as a reusable run_agent function, walk it line by line, and watch it handle a request that needs two tools before it can answer.
By the end of this lesson, you will be able to:
- Write a general run_agent function that drives Claude through any number of tool calls
- Explain why the messages list is the agent’s memory across iterations
- Use a max_steps guard so the loop can’t run forever
- Read a run’s trace: how many model calls happened, and what’s in the transcript
This is where the pieces from Lessons 1 and 2 become a working agent. Let’s build it.
The Loop
Here’s the whole idea in one sentence: call the model, append its turn, and while the stop reason is tool_use, run the tools and feed the results back — until a final answer, bounded by a max-steps guard. That’s the loop. The figure below is the same idea as a flowchart, and then we’ll look at the code.
And here is that flowchart as real Python. This is the exact function we ran and verified — read it once top to bottom, then we’ll take it apart piece by piece:
def run_agent(client, user_message, *, system, tools, tool_functions,
              model="claude-haiku-4-5", max_steps=8):
    messages = [{"role": "user", "content": user_message}]
    for step in range(1, max_steps + 1):
        response = client.messages.create(
            model=model, max_tokens=1024, system=system, tools=tools, messages=messages)
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            final = "".join(b.text for b in response.content if b.type == "text")
            return {"answer": final, "steps": step, "messages": messages}
        tool_results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            fn = tool_functions.get(block.name)
            try:
                if fn is None:
                    raise KeyError(f"no such tool: {block.name}")
                result = fn(**block.input)
                tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                                     "content": str(result)})
            except Exception as exc:
                tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                                     "content": f"Error: {exc}", "is_error": True})
        messages.append({"role": "user", "content": tool_results})
    return {"answer": "Stopped: reached the step limit.", "steps": max_steps, "messages": messages}

Notice what the function takes in: a client, the user_message, and then, as keyword-only arguments, the system prompt, the tools definitions (the dictionaries from Lesson 1), and tool_functions, a dictionary mapping each tool’s name to the real Python function that runs it. That last one is the bridge between Claude’s request to call a tool and your code actually running it.
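To make the tools and tool_functions arguments concrete, here is a minimal sketch of what the pair might look like for the two tools used later in this lesson. The schemas follow the name / description / input_schema shape of the tool dictionaries from Lesson 1. The function bodies, the fixed exchange rate, and the returned strings are placeholders invented for illustration, not real data sources.

```python
# Hypothetical stand-ins: a real tool would call a weather or FX service.
def get_weather(city):
    return f"{city}: 16°C, clear"  # canned reply, not live data


def convert_currency(amount, from_currency, to_currency):
    rates = {("JPY", "USD"): 0.0067}  # made-up fixed rate for this sketch
    return round(amount * rates[(from_currency, to_currency)], 2)


# What Claude sees: one schema dictionary per tool.
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
    {
        "name": "convert_currency",
        "description": "Convert an amount between two currencies.",
        "input_schema": {
            "type": "object",
            "properties": {
                "amount": {"type": "number"},
                "from_currency": {"type": "string"},
                "to_currency": {"type": "string"},
            },
            "required": ["amount", "from_currency", "to_currency"],
        },
    },
]

# What your code runs: tool name -> Python callable.
tool_functions = {"get_weather": get_weather, "convert_currency": convert_currency}

# Every advertised tool should have a function behind it, and vice versa.
assert {t["name"] for t in tools} == set(tool_functions)
```

The parameter names in each input_schema match the keyword arguments of the Python function, which is what lets the loop call fn(**block.input) without any per-tool glue.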
Walking It Line by Line
The function is short, but every line is load-bearing. Let’s go through it in the order it executes.
The messages list is the memory. The first line builds it:
messages = [{"role": "user", "content": user_message}]

This list is the entire conversation, and it grows with every iteration. Claude is stateless (it remembers nothing between calls), so on each loop pass you send the whole messages list again. That’s why this list, not some hidden state, is the agent’s memory: everything Claude knows about this run lives here, in order.
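As a rough illustration of that growth, the sketch below replays one tool round by hand, using plain dictionaries in place of real API responses. The block contents and the id toolu_01 are made up for the example.

```python
messages = [{"role": "user", "content": "What's the weather in Kyoto?"}]
sizes = [len(messages)]

# Pass 1: the assistant asks for a tool, and we answer with its result.
messages.append({"role": "assistant", "content": [
    {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
     "input": {"city": "Kyoto"}}]})
sizes.append(len(messages))
messages.append({"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_01", "content": "16°C, clear"}]})
sizes.append(len(messages))

# Pass 2: the assistant answers in prose, so nothing follows its turn.
messages.append({"role": "assistant", "content": [
    {"type": "text", "text": "Kyoto is 16°C and clear."}]})
sizes.append(len(messages))

roles = [m["role"] for m in messages]
print(sizes)  # [1, 2, 3, 4]
print(roles)  # ['user', 'assistant', 'user', 'assistant']
```

Nothing is ever removed or rewritten along the way; each model call would receive the list exactly as it stands at that moment.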
The for loop bounds the run. Instead of while True, the loop counts:
for step in range(1, max_steps + 1):

Each pass is one model call. If you used while True and the model never stopped requesting tools (because of a bug, a confusing prompt, or a tool whose output always prompts another call), your agent would loop forever, burning tokens and money. max_steps is the seatbelt: at most max_steps model calls, then the loop ends no matter what.
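The bound can be seen in isolation with a stripped-down loop. In this sketch, bounded_loop and the two stand-in "models" are hypothetical names introduced here; each stand-in just returns a stop reason instead of calling the API.

```python
def bounded_loop(next_stop_reason, max_steps=8):
    """Call next_stop_reason() once per pass, at most max_steps times."""
    for step in range(1, max_steps + 1):
        if next_stop_reason() != "tool_use":
            return {"finished": True, "steps": step}
    return {"finished": False, "steps": max_steps}


# A stand-in "model" that never stops asking for tools.
calls = []

def stuck_model():
    calls.append(1)
    return "tool_use"

runaway = bounded_loop(stuck_model, max_steps=5)
print(runaway, len(calls))  # {'finished': False, 'steps': 5} 5

# A well-behaved stand-in that finishes on its third call.
script = iter(["tool_use", "tool_use", "end_turn"])
healthy = bounded_loop(lambda: next(script), max_steps=5)
print(healthy)  # {'finished': True, 'steps': 3}
```

With while True in place of the for loop, the first call would never return.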
Call the model, then append its turn. Two lines, always together:
response = client.messages.create(
    model=model, max_tokens=1024, system=system, tools=tools, messages=messages)
messages.append({"role": "assistant", "content": response.content})

You append response.content (the assistant’s full turn, including any tool_use blocks) before you do anything else with it. This is essential: if you later send back a tool_result, the API needs the matching tool_use to already be in the transcript, or it rejects the request. Append first, process second.
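To see the pairing rule in isolation, the sketch below checks a transcript locally: each tool_result in a user message should point at a tool_use id in the message directly before it. The helper name unmatched_results and the dict-shaped blocks are assumptions for this example; the SDK returns block objects, and the API applies its own validation.

```python
def unmatched_results(messages):
    """Return tool_use_ids of tool_results with no tool_use in the prior message."""
    missing = []
    for prev, msg in zip([None] + messages[:-1], messages):
        if msg["role"] != "user" or not isinstance(msg["content"], list):
            continue
        asked = set()
        if prev is not None and isinstance(prev["content"], list):
            asked = {b["id"] for b in prev["content"] if b["type"] == "tool_use"}
        for block in msg["content"]:
            if block["type"] == "tool_result" and block["tool_use_id"] not in asked:
                missing.append(block["tool_use_id"])
    return missing


question = {"role": "user", "content": "Weather in Kyoto?"}
request = {"role": "assistant", "content": [
    {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
     "input": {"city": "Kyoto"}}]}
result = {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_01", "content": "16°C"}]}

print(unmatched_results([question, request, result]))  # []
# Forgetting to append the assistant turn leaves the result orphaned.
print(unmatched_results([question, result]))  # ['toolu_01']
```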
Check the stop reason: is the agent done? This is the loop’s exit:
if response.stop_reason != "tool_use":
    final = "".join(b.text for b in response.content if b.type == "text")
    return {"answer": final, "steps": step, "messages": messages}

If the stop reason is anything other than "tool_use" (usually "end_turn"), Claude has answered in prose and the run is over. You stitch together the text blocks into the final answer and return. Note the framing: we don’t check “is it end_turn?”, we check “is it not tool_use?” Any non-tool stop reason means there’s no tool to run, so the loop is done. One caveat: "max_tokens" also exits here, and it means the reply was cut off at the max_tokens limit rather than completed.
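The exit test and the text-stitching line can be tried without the API. In this sketch, SimpleNamespace objects stand in for the SDK’s responses and content blocks (an assumption for illustration; they expose only the stop_reason, type, and text attributes the loop reads). The helper names final_text and is_done are introduced here.

```python
from types import SimpleNamespace


def final_text(content):
    """Join the text blocks of one assistant turn, skipping anything else."""
    return "".join(b.text for b in content if b.type == "text")


def is_done(response):
    # Mirrors the loop's exit test: anything except "tool_use" ends the run.
    return response.stop_reason != "tool_use"


finished = SimpleNamespace(stop_reason="end_turn", content=[
    SimpleNamespace(type="text", text="Kyoto is 16°C. "),
    SimpleNamespace(type="text", text="Pack a light jacket."),
])
asking = SimpleNamespace(stop_reason="tool_use", content=[
    SimpleNamespace(type="text", text="Let me check the weather."),
    SimpleNamespace(type="tool_use", id="toolu_01", name="get_weather",
                    input={"city": "Kyoto"}),
])
# A truncated reply takes the same exit as a finished one.
cut_off = SimpleNamespace(stop_reason="max_tokens", content=[
    SimpleNamespace(type="text", text="Kyoto is"),
])

for r in (finished, asking, cut_off):
    print(r.stop_reason, is_done(r), repr(final_text(r.content)))
```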
Otherwise, run every requested tool. If we’re past that if, the stop reason was tool_use, so there’s at least one tool to run — possibly several in one turn:
tool_results = []
for block in response.content:
    if block.type != "tool_use":
        continue
    fn = tool_functions.get(block.name)
    try:
        if fn is None:
            raise KeyError(f"no such tool: {block.name}")
        result = fn(**block.input)
        tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                             "content": str(result)})
    except Exception as exc:
        tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                             "content": f"Error: {exc}", "is_error": True})

For each tool_use block, you look up the matching Python function by name, call it with the arguments Claude chose (fn(**block.input)), and collect a tool_result that echoes the same block.id. That id is how the API pairs your result with Claude’s request. The try/except means a tool that fails doesn’t crash your agent: the error comes back as a tool_result so Claude can react to it. (We’ll go deep on that error handling and on multiple tools in one turn, known as parallel tool use, in the next lesson; for now, just notice that each tool_use block produces exactly one tool_result.)
Feed all results back as one user message. The last line of the loop body:
messages.append({"role": "user", "content": tool_results})

Every tool_result from this turn goes into a single user message. Then the for loop comes around again, calls the model with the now-longer messages list, and Claude continues, armed with the tool outputs you just provided. When the model has everything it needs, it returns a non-tool_use stop reason and the if above returns the answer.
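The tool-running step and the single user message that carries its results can be exercised together, offline. The sketch below pulls the per-block logic into a helper; run_one_tool and the divide tool are hypothetical names introduced here, and SimpleNamespace objects stand in for the SDK’s tool_use blocks.

```python
from types import SimpleNamespace


def run_one_tool(block, tool_functions):
    """Turn one tool_use block into exactly one tool_result dict."""
    fn = tool_functions.get(block.name)
    try:
        if fn is None:
            raise KeyError(f"no such tool: {block.name}")
        return {"type": "tool_result", "tool_use_id": block.id,
                "content": str(fn(**block.input))}
    except Exception as exc:
        return {"type": "tool_result", "tool_use_id": block.id,
                "content": f"Error: {exc}", "is_error": True}


def divide(a, b):  # hypothetical tool, used here only to trigger a failure
    return a / b


tool_functions = {"divide": divide}
blocks = [
    SimpleNamespace(type="tool_use", id="toolu_01", name="divide", input={"a": 6, "b": 3}),
    SimpleNamespace(type="tool_use", id="toolu_02", name="divide", input={"a": 1, "b": 0}),
    SimpleNamespace(type="tool_use", id="toolu_03", name="multiply", input={"a": 2, "b": 2}),
]

# Three requests in, three results out, all carried by one user message.
results = [run_one_tool(b, tool_functions) for b in blocks]
user_message = {"role": "user", "content": results}
for r in results:
    print(r)
```

The success, the exception, and the unknown tool each yield one tool_result with the matching id. One detail worth knowing: str() of a KeyError wraps the message in quotes, so the unknown-tool result reads Error: 'no such tool: multiply'.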
The fallback return. If the loop runs all the way through max_steps without ever hitting a final answer, control falls out of the for loop to the last line, which returns a polite “reached the step limit” message instead of looping forever. In a healthy run you never reach this line.
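A caller may want to tell the two exits apart without matching on the answer string. One possible approach, inferred from the control flow rather than offered by run_agent itself: the healthy exit returns right after appending an assistant turn, while the fallback returns right after appending tool results in a user message. The helper name hit_step_limit and the hand-built result dictionaries below are illustrative only.

```python
def hit_step_limit(result):
    """True when the transcript ends on our own message, not on Claude's answer."""
    return result["messages"][-1]["role"] == "user"


transcript = [
    {"role": "user", "content": "Weather in Kyoto?"},
    {"role": "assistant", "content": "(tool_use blocks)"},
    {"role": "user", "content": "(tool_result blocks)"},
    {"role": "assistant", "content": "(final text)"},
]
healthy = {"answer": "Kyoto is 16°C.", "steps": 2, "messages": transcript}
# Same run with max_steps=1: the loop ends before Claude can answer.
capped = {"answer": "Stopped: reached the step limit.", "steps": 1,
          "messages": transcript[:3]}

print(hit_step_limit(healthy))  # False
print(hit_step_limit(capped))   # True
```

Comparing steps to max_steps would not be enough, because a healthy run can finish on exactly its last allowed step.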
Running It
Enough reading; let’s watch it work. Suppose the user asks something that needs two tools: a get_weather tool and a convert_currency tool (the same two you sketched across Lessons 1 and 2). Claude returns both tool_use blocks in a single turn, your loop runs both, feeds both results back, and Claude answers. Here is the real output from running run_agent against that request (the step lines are printed from inside the loop; the listing above leaves those print calls out):
step 1: Claude calls get_weather({'city': 'Kyoto'})
step 1: Claude calls convert_currency({'amount': 20000, 'from_currency': 'JPY', 'to_currency': 'USD'})
final answer: Kyoto is 16°C, clear and crisp, and 20000 JPY is about 134.0 USD.
steps taken: 2
messages in transcript: 4

Read the trace carefully, because it tells you exactly how the loop behaved:
- Step 1: both tool calls happen here. Claude’s first turn requested two tools at once, so your code ran both get_weather and convert_currency and collected two tool_results into one user message.
- The final answer arrives on step 2. With both tool results now in messages, Claude had everything it needed and returned a prose answer (a non-tool_use stop reason), so the loop returned.
- steps taken: 2 means the loop made 2 model calls: one to get the tool requests, one to get the final answer. Running two tools did not cost two extra model calls; they came back together in step 1’s single turn.
- messages in transcript: 4 is the conversation that accumulated: user (the question) → assistant (two tool_use blocks) → user (two tool_result blocks) → assistant (the final text). Four messages, exactly matching the loop’s appends.
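That mapping (2 model calls, 4 messages) follows from the control flow once Claude asks for both tools in a single turn. The transcript starts with one user message; each round of tool use adds one model call and two messages (the assistant’s turn and your results); the final answer adds one more call and one more message. Had Claude asked for the two tools one after the other, the same loop would have made 3 model calls and left 6 messages. The exact wording of “final answer” is illustrative; phrasing varies from run to run because it’s generated text. What never varies is the shape: append the turn, check the stop reason, run tools if asked, feed results back, repeat.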
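These counts can be checked offline by driving the loop with canned responses instead of the real API. The sketch below repeats run_agent from this lesson so that it runs on its own. ScriptedClient, the tool_use and turn helpers, and the lambda tools are hypothetical stand-ins that mimic only the attributes the loop reads; none of them are part of the SDK.

```python
from types import SimpleNamespace as NS


def run_agent(client, user_message, *, system, tools, tool_functions,
              model="claude-haiku-4-5", max_steps=8):
    messages = [{"role": "user", "content": user_message}]
    for step in range(1, max_steps + 1):
        response = client.messages.create(
            model=model, max_tokens=1024, system=system, tools=tools, messages=messages)
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            final = "".join(b.text for b in response.content if b.type == "text")
            return {"answer": final, "steps": step, "messages": messages}
        tool_results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            fn = tool_functions.get(block.name)
            try:
                if fn is None:
                    raise KeyError(f"no such tool: {block.name}")
                result = fn(**block.input)
                tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                                     "content": str(result)})
            except Exception as exc:
                tool_results.append({"type": "tool_result", "tool_use_id": block.id,
                                     "content": f"Error: {exc}", "is_error": True})
        messages.append({"role": "user", "content": tool_results})
    return {"answer": "Stopped: reached the step limit.", "steps": max_steps, "messages": messages}


class ScriptedClient:
    """Offline stand-in for the SDK client: replays canned responses in order."""
    def __init__(self, responses):
        self._responses = iter(responses)
        self.messages = self  # so client.messages.create(...) resolves here

    def create(self, **kwargs):
        return next(self._responses)


def tool_use(block_id, name, **args):
    return NS(type="tool_use", id=block_id, name=name, input=args)

def turn(stop_reason, *blocks):
    return NS(stop_reason=stop_reason, content=list(blocks))


weather = tool_use("toolu_01", "get_weather", city="Kyoto")
fx = tool_use("toolu_02", "convert_currency", amount=20000,
              from_currency="JPY", to_currency="USD")
answer = turn("end_turn", NS(type="text", text="(final answer)"))

tool_functions = {  # canned tool outputs, not real lookups
    "get_weather": lambda city: "16°C, clear",
    "convert_currency": lambda amount, from_currency, to_currency: 134.0,
}

def run(script):
    out = run_agent(ScriptedClient(script), "Kyoto trip?", system="", tools=[],
                    tool_functions=tool_functions)
    return out["steps"], len(out["messages"])

# Both tools requested in one turn, as in the trace above.
parallel = run([turn("tool_use", weather, fx), answer])
# The same two tools requested one turn at a time.
sequential = run([turn("tool_use", weather), turn("tool_use", fx), answer])
print(parallel)    # (2, 4)
print(sequential)  # (3, 6)
```

The loop code is identical in both runs; only the scripted turns differ, which is why the step and message counts depend on how Claude groups its tool requests.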
The messages list is the whole memory of the run
There is no hidden agent state anywhere — the messages list is the memory. Every iteration appends to it (the assistant’s turn, then your tool results), and every model call sends the full list back, because Claude itself is stateless between calls. This is why the transcript grew to four messages for a two-tool answer, and it’s why everything you’ll later hear about “agent memory” — trimming old turns, summarizing, persisting across sessions — is really about managing this one list. Master the list and you’ve mastered the loop.
Practice Exercises
Exercise 1: Why a loop, not a single call?
Lesson 2 handled one tool call by calling the model, running the tool, and calling the model once more. Why can’t every agent just do that — call, run, call — instead of looping?
Hint
Because you don’t know in advance how many tool calls a request needs. One question might need zero tools, another two, and another a tool whose result prompts a follow-up tool call. A fixed “call, run, call” only handles exactly one round of tools; the loop handles any number — including zero — because it keeps going as long as the stop reason is tool_use and stops the instant it isn’t.
Exercise 2: What makes the loop stop?
There are two distinct ways run_agent can return. What are they, and which one is the “normal, healthy” exit?
Hint
The healthy exit is the if response.stop_reason != "tool_use": branch — Claude returned a final answer (any non-tool_use stop reason, usually "end_turn"), so the loop returns the text. The other exit is the fallback return after the for loop: the agent hit max_steps model calls without ever finishing, so it returns “reached the step limit.” The first is success; the second is the safety net that stops a runaway agent.
Exercise 3: How many model calls for a two-tool request?
In the verified run, the question needed two tools and the trace showed steps taken: 2. Why is it 2 model calls and not 3 (one per tool plus a final answer)?
Hint
Because Claude requested both tools in a single turn — two tool_use blocks in one response. Your loop ran both and returned both tool_results in one user message, so it only took one model call to request the tools and one more to answer. The number of model calls is the number of times around the loop, not the number of tools — running extra tools in the same turn is free in terms of model calls.
Summary
The agent loop generalizes a single tool call into a function that drives Claude through any number of tool calls. You start with messages = [the user message], then loop: call messages.create, append the assistant’s turn (response.content), and check the stop reason. If it’s not "tool_use", you extract the text and return — the agent is done. If it is "tool_use", you run every requested tool, collect a tool_result (echoing each tool_use id) into one user message, append it, and loop again. A max_steps guard caps the number of model calls so a misbehaving agent can’t run forever. The running messages list is the agent’s memory — it grows each pass and is sent in full every call, because Claude is stateless. In the verified run, a two-tool request took 2 model calls and produced a 4-message transcript.
Key Concepts
- run_agent loop: call model → append turn → if not tool_use, return; else run tools, append results, repeat.
- messages as memory: the one list holds the whole run; sent in full on every call.
- max_steps guard: bounds the loop so it can’t run forever.
- Stop-reason check: exit on any non-tool_use stop reason; continue only on tool_use.
- Model calls vs. tools: steps = times around the loop, not number of tools (parallel tools share a turn).
Why This Matters
This loop is the agent. Everything else in agent engineering — better tools, memory management, guardrails, multi-step planning — is built on top of this exact control flow. Once you can see how the messages list carries the run and how the stop reason gates the loop, the rest of the field stops feeling like magic: an “agent framework” is mostly this loop plus conveniences. And the max_steps guard is your first taste of a recurring theme — agents need bounds, because a model in a loop will happily keep going. Next you’ll make the loop sturdier: handling tool errors gracefully and running multiple tools in parallel, which is where production agents earn their reliability.
Next Steps
Continue to Lesson 4 - Errors, Parallel Tools, and Stopping
Make the loop production-ready: handle tool failures as results, run multiple tool calls in one turn, and reason about when to stop.
Back to Module Overview
Return to The Agent Loop with Claude module overview
Continue Building Your Skills
You now have a working agent loop — a single function that drives Claude through any number of tool calls until it produces a final answer, bounded by a step limit. Next you’ll harden it: turning tool failures into tool_results the model can recover from, handling several tool calls in a single turn, and thinking carefully about the conditions that should make an agent stop.