Lesson 5 - Guided Project: A Multi-Tool Workflow Agent

Welcome to the Guided Project

You’ll build a small assistant equipped with several real tools and a robust tool-use loop, so it can answer compound questions by chaining tool calls — the foundation of every agent you’ll build later. By the end you’ll have a single file, tool_agent.py, that you run in a terminal and talk to. Ask it “What’s 15% of 240, and what time is it?” and it will reach for two tools in a single round, run them both, and stitch the results into one answer.

Nothing here is brand-new theory. Each step snaps together one idea from earlier in this module: the tool definitions and schemas from Lessons 1 and 2, the request-execute-respond loop from Lesson 3, and the error handling and multiple-tool-per-round details from Lesson 4. The project is where those separate pieces stop being exercises and become one program you understand top to bottom.

By the end of this lesson, you will be able to:

  • Define a small toolbox of real, working tools, each with a clear schema
  • Route a tool request to the function that runs it with a name→function dispatcher
  • Write a loop that handles several tool calls per round and reports tool errors back to the model
  • Tell the assistant, in its system prompt, that it has tools and should use them rather than guess

You only need the Anthropic SDK, your API key in an environment variable, and Lessons 1 through 4. Let’s build it.


Step 1: Define a Small Toolbox of Real Tools

Start a fresh file called tool_agent.py. The first job is to write the tools — actual Python functions that do real work — and a schema for each so the model knows when and how to call them. We’ll give the assistant four: a calculator, a clock, a unit converter, and a tiny knowledge base.

As in every lesson, the client reads your key from the ANTHROPIC_API_KEY environment variable, so you never put the key in the code. If you haven’t set the key for this terminal session or installed the SDK yet:

export ANTHROPIC_API_KEY="your-key-here"
pip install anthropic

Now the top of the file: imports, the client, and the model.

import ast
import operator
from datetime import datetime

import anthropic

# The client reads your key from the ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic()

MODEL = "claude-haiku-4-5"

A safe calculator

A language model guesses at arithmetic; a real function computes it. But we have to be careful: Python’s built-in eval would run any code the model sends, which is a security hole. Instead we parse the expression into a syntax tree and evaluate only the arithmetic nodes we explicitly allow.

_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.Mod: operator.mod,
    ast.USub: operator.neg, ast.UAdd: operator.pos,
}


def _eval_node(node):
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError("unsupported expression")


def calculator(expression):
    """Exactly evaluate a basic arithmetic expression."""
    tree = ast.parse(expression, mode="eval")
    return _eval_node(tree.body)

A quick check that it computes exactly:

print(calculator("15 * 240 * 0.15"))
print(calculator("(2 + 3) ** 2"))
540.0
25
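To see what the allow-list buys you, here is a small self-contained sketch that repeats the evaluator above and feeds it inputs that a plain eval would handle very differently. The sample expressions and the try_calc wrapper are illustrative additions, not part of tool_agent.py.

```python
import ast
import operator

# Same allow-list as in the lesson: only these operator nodes are evaluated.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.Mod: operator.mod,
    ast.USub: operator.neg, ast.UAdd: operator.pos,
}


def _eval_node(node):
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError("unsupported expression")


def calculator(expression):
    tree = ast.parse(expression, mode="eval")
    return _eval_node(tree.body)


def try_calc(expression):
    """Hypothetical wrapper: return the result, or why the input was rejected."""
    try:
        return calculator(expression)
    except (ValueError, SyntaxError, ZeroDivisionError) as exc:
        return f"rejected: {type(exc).__name__}"


print(try_calc("240 * 0.15"))                 # 36.0
print(try_calc("__import__('os').getcwd()"))  # rejected: ValueError (a Call node)
print(try_calc("1 / 0"))                      # rejected: ZeroDivisionError
print(try_calc("2 +"))                        # rejected: SyntaxError (fails in ast.parse)
```

Keep in mind that the allow-list restricts which operations can run, not how expensive they are. A very large exponent is still accepted and can take a long time to compute, so a production tool would probably also cap operand sizes.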

A clock, a unit converter, and a knowledge base

The other three tools are short. current_datetime reads the real system clock; convert_units looks up a factor table; and lookup_fact is a small in-memory dictionary the model can query.

def current_datetime():
    """Return the real current date and time."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")


_UNITS = {
    "km": ("length", 1000.0), "m": ("length", 1.0), "cm": ("length", 0.01),
    "mile": ("length", 1609.344), "ft": ("length", 0.3048),
    "kg": ("mass", 1000.0), "g": ("mass", 1.0), "lb": ("mass", 453.592),
}


def convert_units(value, from_unit, to_unit):
    """Convert a value between two compatible units."""
    if from_unit not in _UNITS or to_unit not in _UNITS:
        raise ValueError("unknown unit")
    src_family, src_factor = _UNITS[from_unit]
    dst_family, dst_factor = _UNITS[to_unit]
    if src_family != dst_family:
        raise ValueError("incompatible units")
    return value * src_factor / dst_factor


_FACTS = {
    "speed of light": "299,792,458 meters per second in a vacuum.",
    "python release year": "Python was first released in 1991 by Guido van Rossum.",
    "boiling point of water": "100 degrees Celsius at sea-level pressure.",
}


def lookup_fact(topic):
    """Look up a stored fact by topic."""
    fact = _FACTS.get(topic.lower())
    if fact is None:
        raise ValueError(f"no fact stored for '{topic}'")
    return fact

Notice that two of these functions raise when given bad input: an unknown unit, a missing fact. That’s deliberate. In Step 2 we’ll catch those errors, and in Step 3 we’ll hand them back to the model instead of crashing.
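The converter works through a base unit: the source factor turns the value into meters or grams, and dividing by the destination factor turns that into the target unit. The sketch below uses a shortened copy of the table and a hypothetical describe helper to show the happy path next to both failure modes.

```python
# Shortened copy of the lesson's table: (family, factor to the base unit).
_UNITS = {
    "km": ("length", 1000.0), "m": ("length", 1.0),
    "mile": ("length", 1609.344),
    "kg": ("mass", 1000.0), "g": ("mass", 1.0),
}


def convert_units(value, from_unit, to_unit):
    if from_unit not in _UNITS or to_unit not in _UNITS:
        raise ValueError("unknown unit")
    src_family, src_factor = _UNITS[from_unit]
    dst_family, dst_factor = _UNITS[to_unit]
    if src_family != dst_family:
        raise ValueError("incompatible units")
    return value * src_factor / dst_factor


def describe(value, from_unit, to_unit):
    """Hypothetical helper: the converted value, or the error text it raised."""
    try:
        return convert_units(value, from_unit, to_unit)
    except ValueError as exc:
        return f"Error: {exc}"


# 5 km -> 5 * 1000 m = 5000 m -> 5000 / 1609.344 miles
print(round(convert_units(5, "km", "mile"), 2))  # 3.11
print(describe(5, "kg", "mile"))                 # Error: incompatible units
print(describe(5, "km", "furlong"))              # Error: unknown unit
```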

The schemas

Each function needs a tool definition — a name, a description, and an input schema — exactly as you wrote them in Lessons 1 and 2. The description is the most important field: it’s how the model decides when to reach for the tool, so be specific.

tools = [
    {
        "name": "calculator",
        "description": "Evaluate a basic arithmetic expression exactly. "
                       "Supports + - * / ** % and parentheses.",
        "input_schema": {
            "type": "object",
            "properties": {"expression": {"type": "string",
                           "description": "e.g. '15 * 0.15' or '(2 + 3) ** 2'"}},
            "required": ["expression"],
        },
    },
    {
        "name": "current_datetime",
        "description": "Get the real current date and time as YYYY-MM-DD HH:MM:SS.",
        "input_schema": {"type": "object", "properties": {}},
    },
    {
        "name": "convert_units",
        "description": "Convert a value between units. Length: km, m, cm, mile, "
                       "ft. Mass: kg, g, lb.",
        "input_schema": {
            "type": "object",
            "properties": {
                "value": {"type": "number"},
                "from_unit": {"type": "string"},
                "to_unit": {"type": "string"},
            },
            "required": ["value", "from_unit", "to_unit"],
        },
    },
    {
        "name": "lookup_fact",
        "description": "Look up a stored fact. Known topics: 'speed of light', "
                       "'python release year', 'boiling point of water'.",
        "input_schema": {
            "type": "object",
            "properties": {"topic": {"type": "string"}},
            "required": ["topic"],
        },
    },
]

current_datetime has no inputs

A tool can take zero arguments. Its input_schema is still a valid object — {"type": "object", "properties": {}} — it just has no properties. When the model calls it, the input will be an empty dictionary {}. Tools that read the world (the clock, a sensor, “today’s headlines”) often need no inputs at all.
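A minimal sketch of what that empty input means in practice: unpacking {} with ** passes no keyword arguments at all, so the same calling convention works for a zero-argument tool. The tool_input value below is a stand-in for what the model would send.

```python
from datetime import datetime


def current_datetime():
    """Return the current local date and time as a formatted string."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")


tool_input = {}  # stand-in for the (empty) input of a tool_use block

# **{} expands to no arguments, so this is the same as current_datetime().
stamp = current_datetime(**tool_input)
print(stamp)
```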


Step 2: A Name → Function Dispatcher

The model never runs your code (the control invariant from Lesson 1). When it wants calculator, it sends you a tool_use block carrying the name "calculator" and the input dictionary. Your job is to turn that name into the actual function and call it. A dictionary that maps names to functions — a dispatcher — does exactly this, and it stays clean as you add more tools.

DISPATCH = {
    "calculator": calculator,
    "current_datetime": current_datetime,
    "convert_units": convert_units,
    "lookup_fact": lookup_fact,
}


def run_tool(name, tool_input):
    """Run one tool by name and return (result_text, is_error)."""
    try:
        func = DISPATCH[name]
        result = func(**tool_input)
        return str(result), False
    except Exception as exc:  # report failures back to the model, not crash
        return f"Error: {exc}", True

Two details earn their keep here. First, func(**tool_input) unpacks the model’s input dictionary straight into the function’s arguments — {"value": 5, "from_unit": "km", "to_unit": "mile"} becomes convert_units(value=5, from_unit="km", to_unit="mile"). Because your schema named the properties the same as the parameters, this just works. Second, the try/except means a bad tool call — an unknown unit, a missing fact — comes back as (error_text, True) instead of crashing your program. We’ll use that is_error flag in the next step.
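It helps to see the shapes run_tool can return before wiring it into the loop. This sketch keeps run_tool as written above but swaps in two made-up stand-in tools (shout and halve are hypothetical, chosen only to keep the example short).

```python
def shout(text):
    """Hypothetical stand-in tool: upper-case a string."""
    return text.upper()


def halve(value):
    """Hypothetical stand-in tool: divide by two, rejecting negatives."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return value / 2


DISPATCH = {"shout": shout, "halve": halve}


def run_tool(name, tool_input):
    """Run one tool by name and return (result_text, is_error)."""
    try:
        func = DISPATCH[name]
        result = func(**tool_input)
        return str(result), False
    except Exception as exc:
        return f"Error: {exc}", True


print(run_tool("halve", {"value": 9}))      # ('4.5', False)
print(run_tool("halve", {"value": -1}))     # ('Error: value must be non-negative', True)
print(run_tool("nope", {}))                 # ("Error: 'nope'", True)
print(run_tool("shout", {"txt": "hi"})[1])  # True: wrong argument name raises TypeError
```

The third case is worth noticing: an unknown tool name surfaces as a bare KeyError message, which is terse. If you want friendlier text for the model, one option is to check whether name is in DISPATCH before calling.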

Why a dispatcher beats a big if/elif

You could write if name == "calculator": ... elif name == "current_datetime": .... It works, but every new tool grows the chain. A name → function dictionary scales flat: adding a tool is one new entry, and run_tool never changes. This is the pattern real agents use to manage dozens of tools.
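One thing the dictionary does not do for you is keep the two registries in step: a tool has to appear in tools (so the model can request it) and in DISPATCH (so you can run it), and its schema properties have to match the function’s parameter names. A start-up check along these lines could catch a mismatch before the first request. The helper name check_toolbox is an assumption, and the toolbox is cut down to two tools to keep the sketch short.

```python
import inspect
from datetime import datetime


def current_datetime():
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")


def lookup_fact(topic):
    facts = {"speed of light": "299,792,458 meters per second in a vacuum."}
    return facts[topic.lower()]


tools = [
    {"name": "current_datetime",
     "description": "Get the real current date and time.",
     "input_schema": {"type": "object", "properties": {}}},
    {"name": "lookup_fact",
     "description": "Look up a stored fact.",
     "input_schema": {"type": "object",
                      "properties": {"topic": {"type": "string"}},
                      "required": ["topic"]}},
]

DISPATCH = {"current_datetime": current_datetime, "lookup_fact": lookup_fact}


def check_toolbox(tools, dispatch):
    """Hypothetical helper: list problems; an empty list means all is consistent."""
    problems = []
    schema_names = {tool["name"] for tool in tools}
    # Names present in only one of the two registries.
    for name in sorted(schema_names ^ set(dispatch)):
        problems.append(f"'{name}' is not in both tools and DISPATCH")
    # Schema property names must equal the function's parameter names,
    # because the loop calls func(**tool_input).
    for tool in tools:
        func = dispatch.get(tool["name"])
        if func is None:
            continue
        params = set(inspect.signature(func).parameters)
        props = set(tool["input_schema"]["properties"])
        if params != props:
            problems.append(f"'{tool['name']}' schema/parameter mismatch")
    return problems


print(check_toolbox(tools, DISPATCH))  # []
print(check_toolbox(tools, {"current_datetime": current_datetime}))
```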


Step 3: The Robust Tool-Use Loop

Here’s the heart of the agent — the loop you first met in Lesson 3, now hardened with the two things Lesson 4 added: many tool calls in a single round, and error reporting via is_error.

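The shape is a while True loop. Each pass calls the model. If the model did not ask for a tool (stop_reason is anything other than "tool_use", normally "end_turn"), we return its text. Otherwise it asked for one or more tools, so we run every tool_use block in the response, collect all the results into one list, send them back together, and loop again. One caveat: a reply cut off at the max_tokens limit also takes the return branch, so its text may be incomplete.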

def answer(question, messages):
    """Run one user turn to completion, looping over tool calls."""
    messages.append({"role": "user", "content": question})

    while True:
        response = client.messages.create(
            model=MODEL, max_tokens=600, system=SYSTEM,
            tools=tools, messages=messages,
        )
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason != "tool_use":
            # No tool requested — the model has written its final answer.
            text = "".join(b.text for b in response.content if b.type == "text")
            return text

        # Run every tool the model asked for this round, collect all results.
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result_text, is_error = run_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result_text,
                    "is_error": is_error,
                })
        messages.append({"role": "user", "content": tool_results})

Three things make this loop robust rather than fragile:

  • It iterates over every block. A single response can contain several tool_use blocks. The for loop handles one or twenty without changing.
  • It returns all results in one user message. Every tool_result for a round goes into a single tool_results list, keyed to its request’s id. Splitting them across separate messages confuses the model — Lesson 4’s rule.
  • It passes the error flag through. When a tool raises, run_tool returns is_error=True, and we set "is_error": True on the result. The model sees the error and can recover — try a different unit, ask you to clarify — instead of receiving a silent wrong answer.
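If you want to watch the loop handle two tool calls in one round without spending API calls, you can drive the same logic with a scripted stand-in for the client. Everything here is test scaffolding under stated assumptions: FakeMessages, the add and fail tools, and the tool ids are made up, and the SimpleNamespace objects only mimic the attributes the loop reads (type, name, input, id, text). It is not how the Anthropic SDK is implemented.

```python
from types import SimpleNamespace


def tool_use_block(block_id, name, tool_input):
    return SimpleNamespace(type="tool_use", id=block_id, name=name, input=tool_input)


def text_block(value):
    return SimpleNamespace(type="text", text=value)


class FakeMessages:
    """Hypothetical stand-in for client.messages: replays scripted responses."""

    def __init__(self, script):
        self._script = list(script)

    def create(self, **kwargs):
        return self._script.pop(0)


# Round 1 asks for two tools at once; round 2 is the final answer.
script = [
    SimpleNamespace(stop_reason="tool_use", content=[
        tool_use_block("toolu_demo_1", "add", {"a": 2, "b": 3}),
        tool_use_block("toolu_demo_2", "fail", {}),
    ]),
    SimpleNamespace(stop_reason="end_turn",
                    content=[text_block("2 + 3 = 5; the other tool failed.")]),
]
client = SimpleNamespace(messages=FakeMessages(script))


def add(a, b):
    return a + b


def fail():
    raise ValueError("always fails")


DISPATCH = {"add": add, "fail": fail}


def run_tool(name, tool_input):
    try:
        return str(DISPATCH[name](**tool_input)), False
    except Exception as exc:
        return f"Error: {exc}", True


def answer(question, messages):
    messages.append({"role": "user", "content": question})
    while True:
        response = client.messages.create(messages=messages)
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            return "".join(b.text for b in response.content if b.type == "text")
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result_text, is_error = run_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result", "tool_use_id": block.id,
                    "content": result_text, "is_error": is_error,
                })
        messages.append({"role": "user", "content": tool_results})


history = []
reply = answer("What is 2 + 3?", history)
print(reply)
print(history[2]["content"])  # both tool_result entries, in one user message
```

The history ends up with four entries (user, assistant, user, assistant), and the single user message in the middle carries both results, one of them flagged with is_error.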

Append the assistant turn before the results

Every pass appends response.content (the assistant’s turn, including its tool_use blocks) to messages before appending the tool_results. The API requires that each tool_result follow the matching tool_use in the history. Skip the assistant append and the next request is rejected, because its tool_result blocks refer to tool_use ids that appear nowhere in the history. Append both, every round.
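The ordering rule can be checked mechanically. The sketch below represents the history with plain dictionaries and made-up ids (in the real loop the assistant content holds SDK block objects), and uses a hypothetical helper, unmatched_tool_results, to report any tool_result whose id was not requested by the assistant turn immediately before it.

```python
def unmatched_tool_results(messages):
    """Return ids of tool_result blocks with no tool_use in the preceding turn."""
    missing = []
    for index, message in enumerate(messages):
        # Only user messages made of blocks can carry tool results.
        if message["role"] != "user" or isinstance(message["content"], str):
            continue
        previous = messages[index - 1] if index > 0 else {"role": None, "content": []}
        requested = set()
        if previous["role"] == "assistant" and not isinstance(previous["content"], str):
            requested = {b["id"] for b in previous["content"] if b["type"] == "tool_use"}
        for block in message["content"]:
            if block["type"] == "tool_result" and block["tool_use_id"] not in requested:
                missing.append(block["tool_use_id"])
    return missing


question = {"role": "user", "content": "What's 15% of 240, and what time is it?"}
assistant_turn = {"role": "assistant", "content": [
    {"type": "tool_use", "id": "toolu_demo_1", "name": "calculator",
     "input": {"expression": "240 * 0.15"}},
    {"type": "tool_use", "id": "toolu_demo_2", "name": "current_datetime",
     "input": {}},
]}
results_turn = {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_demo_1", "content": "36.0"},
    {"type": "tool_result", "tool_use_id": "toolu_demo_2",
     "content": "2026-06-27 19:57:23"},
]}

good = [question, assistant_turn, results_turn]
bad = [question, results_turn]  # the assistant append was skipped

print(unmatched_tool_results(good))  # []
print(unmatched_tool_results(bad))   # ['toolu_demo_1', 'toolu_demo_2']
```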


Step 4: A System Prompt That Says “Use Your Tools”

The model won’t reach for a tool it doesn’t know it should use. A short system prompt tells it what it has and sets the rule: compute and look things up with tools, don’t guess. It also nudges the assistant toward terminal-friendly, concise answers.

SYSTEM = (
    "You are a helpful command-line assistant with access to tools. "
    "You have a calculator, a clock, a unit converter, and a small "
    "knowledge base. When a question needs exact math, the current "
    "date or time, a unit conversion, or a stored fact, call the "
    "matching tool instead of guessing. A single question may need "
    "more than one tool — call each one you need. Once you have the "
    "results, answer in plain text, concisely, suitable for a terminal."
)

The line that matters most for this project is “A single question may need more than one tool — call each one you need.” That’s the permission the model needs to answer a compound question in one round instead of plodding through it one tool at a time.


Step 5: A REPL with /exit

The last piece is the loop you talk to. A REPL — read, evaluate, print, loop — reads a line, runs the full answer turn (which may make several model calls behind the scenes), prints the reply, and repeats. We give the user one command, /exit, to quit cleanly, and skip blank input so a stray Enter never fires a request.

def main():
    messages = []
    print("Tool agent ready. Ask a question, or /exit to quit.\n")
    while True:
        user_input = input("you> ").strip()
        if not user_input:
            continue
        if user_input == "/exit":
            print("Goodbye.")
            break
        reply = answer(user_input, messages)
        print("agent>", reply, "\n")


if __name__ == "__main__":
    main()

The messages list lives across iterations, so the conversation has memory: the agent remembers earlier turns the same way the assistant in the previous module did. With that, the program is complete — here’s the whole thing.


The Complete tool_agent.py

import ast
import operator
from datetime import datetime

import anthropic

# The client reads your key from the ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic()

MODEL = "claude-haiku-4-5"

# Step 4: the tool-aware system prompt, passed to every request in answer().
SYSTEM = (
    "You are a helpful command-line assistant with access to tools. "
    "You have a calculator, a clock, a unit converter, and a small "
    "knowledge base. When a question needs exact math, the current "
    "date or time, a unit conversion, or a stored fact, call the "
    "matching tool instead of guessing. A single question may need "
    "more than one tool — call each one you need. Once you have the "
    "results, answer in plain text, concisely, suitable for a terminal."
)

# ---------------------------------------------------------------------------
# Step 1: the toolbox — four real, working tools.
# ---------------------------------------------------------------------------

# A safe calculator: parse the expression into an AST and evaluate only the
# arithmetic nodes we allow. eval() would run arbitrary code — this won't.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.Mod: operator.mod,
    ast.USub: operator.neg, ast.UAdd: operator.pos,
}


def _eval_node(node):
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError("unsupported expression")


def calculator(expression):
    """Exactly evaluate a basic arithmetic expression."""
    tree = ast.parse(expression, mode="eval")
    return _eval_node(tree.body)


def current_datetime():
    """Return the real current date and time."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")


# A tiny conversion table: factor to a base unit per measurement family.
_UNITS = {
    "km": ("length", 1000.0), "m": ("length", 1.0), "cm": ("length", 0.01),
    "mile": ("length", 1609.344), "ft": ("length", 0.3048),
    "kg": ("mass", 1000.0), "g": ("mass", 1.0), "lb": ("mass", 453.592),
}


def convert_units(value, from_unit, to_unit):
    """Convert a value between two compatible units."""
    if from_unit not in _UNITS or to_unit not in _UNITS:
        raise ValueError("unknown unit")
    src_family, src_factor = _UNITS[from_unit]
    dst_family, dst_factor = _UNITS[to_unit]
    if src_family != dst_family:
        raise ValueError("incompatible units")
    return value * src_factor / dst_factor


# A small in-memory "knowledge base" the model can look facts up in.
_FACTS = {
    "speed of light": "299,792,458 meters per second in a vacuum.",
    "python release year": "Python was first released in 1991 by Guido van Rossum.",
    "boiling point of water": "100 degrees Celsius at sea-level pressure.",
}


def lookup_fact(topic):
    """Look up a stored fact by topic."""
    fact = _FACTS.get(topic.lower())
    if fact is None:
        raise ValueError(f"no fact stored for '{topic}'")
    return fact


# Each tool gets a clear schema so the model knows when and how to call it.
tools = [
    {
        "name": "calculator",
        "description": "Evaluate a basic arithmetic expression exactly. "
                       "Supports + - * / ** % and parentheses.",
        "input_schema": {
            "type": "object",
            "properties": {"expression": {"type": "string",
                           "description": "e.g. '15 * 0.15' or '(2 + 3) ** 2'"}},
            "required": ["expression"],
        },
    },
    {
        "name": "current_datetime",
        "description": "Get the real current date and time as YYYY-MM-DD HH:MM:SS.",
        "input_schema": {"type": "object", "properties": {}},
    },
    {
        "name": "convert_units",
        "description": "Convert a value between units. Length: km, m, cm, mile, "
                       "ft. Mass: kg, g, lb.",
        "input_schema": {
            "type": "object",
            "properties": {
                "value": {"type": "number"},
                "from_unit": {"type": "string"},
                "to_unit": {"type": "string"},
            },
            "required": ["value", "from_unit", "to_unit"],
        },
    },
    {
        "name": "lookup_fact",
        "description": "Look up a stored fact. Known topics: 'speed of light', "
                       "'python release year', 'boiling point of water'.",
        "input_schema": {
            "type": "object",
            "properties": {"topic": {"type": "string"}},
            "required": ["topic"],
        },
    },
]

# ---------------------------------------------------------------------------
# Step 2: the dispatcher — map a tool name to the function that runs it.
# ---------------------------------------------------------------------------
DISPATCH = {
    "calculator": calculator,
    "current_datetime": current_datetime,
    "convert_units": convert_units,
    "lookup_fact": lookup_fact,
}


def run_tool(name, tool_input):
    """Run one tool by name and return (result_text, is_error)."""
    try:
        func = DISPATCH[name]
        result = func(**tool_input)
        return str(result), False
    except Exception as exc:  # report failures back to the model, not crash
        return f"Error: {exc}", True


# ---------------------------------------------------------------------------
# Step 3: the robust loop — handle many tool_use blocks per round and errors.
# ---------------------------------------------------------------------------
def answer(question, messages):
    """Run one user turn to completion, looping over tool calls."""
    messages.append({"role": "user", "content": question})

    while True:
        response = client.messages.create(
            model=MODEL, max_tokens=600, system=SYSTEM,
            tools=tools, messages=messages,
        )
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason != "tool_use":
            # No tool requested — the model has written its final answer.
            text = "".join(b.text for b in response.content if b.type == "text")
            return text

        # Run every tool the model asked for this round, collect all results.
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result_text, is_error = run_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result_text,
                    "is_error": is_error,
                })
        messages.append({"role": "user", "content": tool_results})


# ---------------------------------------------------------------------------
# Step 5: the REPL.
# ---------------------------------------------------------------------------
def main():
    messages = []
    print("Tool agent ready. Ask a question, or /exit to quit.\n")
    while True:
        user_input = input("you> ").strip()
        if not user_input:
            continue
        if user_input == "/exit":
            print("Goodbye.")
            break
        reply = answer(user_input, messages)
        print("agent>", reply, "\n")


if __name__ == "__main__":
    main()

Run it with python tool_agent.py and ask a compound question. Here is a real session — your text after you>, the agent’s reply after agent>:

Tool agent ready. Ask a question, or /exit to quit.

you> What's 15% of 240, and what time is it?
agent> 15% of 240 is **36**.

The current time is **19:57:43 on June 27, 2026**.

you> Convert 5 km to miles.
agent> 5 km is approximately **3.11 miles**.

you> /exit
Goodbye.

That first answer is the whole point of the project. The single question “What’s 15% of 240, and what time is it?” needs two unrelated capabilities — exact math and the real clock — and the model recognized that, asking for both tools in one round. If you print the tool calls inside the loop, you see exactly that:

you> What's 15% of 240, and what time is it?
   [model requested 2 tool(s) this round: calculator, current_datetime]
     -> calculator({'expression': '240 * 0.15'}) = 36.0
     -> current_datetime({}) = 2026-06-27 19:57:23
agent> 15% of 240 is **36**.

The current time is **19:57:23** on **2026-06-27** (June 27, 2026).

Two tool_use blocks, two results returned together, one fluent answer. That is tool chaining — and it’s the same mechanism that lets a real agent read a file, search a database, and call an API all in service of one request. You just built it from parts you wrote yourself.
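The file as listed does not include the tracing code behind the bracketed lines in that session. One way to get similar output is a pair of small formatting helpers, sketched here under assumed names (describe_round and describe_call). You would print the first right after the stop_reason check and the second after each run_tool call inside answer. The SimpleNamespace objects stand in for the SDK’s blocks, and the text block’s wording is invented.

```python
from types import SimpleNamespace


def describe_round(blocks):
    """Summarize which tools a response asked for."""
    names = [b.name for b in blocks if b.type == "tool_use"]
    return f"   [model requested {len(names)} tool(s) this round: {', '.join(names)}]"


def describe_call(name, tool_input, result_text):
    """Show one tool call and what it returned."""
    return f"     -> {name}({tool_input}) = {result_text}"


# Stand-in blocks with the same attributes the loop reads.
blocks = [
    SimpleNamespace(type="text", text="Let me check."),
    SimpleNamespace(type="tool_use", id="toolu_demo_1", name="calculator",
                    input={"expression": "240 * 0.15"}),
    SimpleNamespace(type="tool_use", id="toolu_demo_2", name="current_datetime",
                    input={}),
]

print(describe_round(blocks))
print(describe_call("calculator", {"expression": "240 * 0.15"}, "36.0"))
```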


Take It Further

The agent works; now make it yours. Each of these is a small change to the file you just wrote:

  • Add a tool. Write a random_choice(options) tool that picks from a list, or a word_count(text) tool. Add the function, register it in DISPATCH, and append its schema to tools — three small edits, and the agent can use it. Watch how the dispatcher pattern makes growth painless.
  • Grow the knowledge base. Add entries to _FACTS, or load it from a JSON file at startup so the “knowledge base” lives outside the code. Ask the agent something only the file knows.
  • Add a /tools command. Alongside /exit, handle a /tools command in the REPL that prints the name and description of every tool without calling the model — a quick way for the user to see what the agent can do.
  • Trigger an error on purpose. Ask the agent to “convert 5 kg to miles.” The convert_units tool raises incompatible units, your loop returns it with is_error=True, and the model sees the failure and explains it instead of inventing a number. That graceful recovery is the payoff of the error handling you built in Steps 2 and 3.
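As a starting point for the first and third ideas, here is one possible shape for a word_count tool and a /tools listing. The helper name format_tool_list and the wording of the description are assumptions. In your file the new entries would go into the existing DISPATCH and tools rather than fresh ones.

```python
def word_count(text):
    """Count whitespace-separated words in a string."""
    return len(text.split())


# Edit 1: register the function (a fresh dict here, to keep the sketch runnable).
DISPATCH = {"word_count": word_count}

# Edit 2: describe it to the model.
tools = [{
    "name": "word_count",
    "description": "Count the words in a piece of text.",
    "input_schema": {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
}]


def format_tool_list(tools):
    """Build the text a /tools command could print, one tool per line."""
    return "\n".join(f"{t['name']}: {t['description']}" for t in tools)


# In main(), a branch such as
#     if user_input == "/tools":
#         print(format_tool_list(tools))
#         continue
# would show the list without calling the model.
print(DISPATCH["word_count"](**{"text": "the quick brown fox"}))  # 4
print(format_tool_list(tools))
```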

Summary

You assembled a complete multi-tool agent from the parts you built across this module. A toolbox of four real functions — each with a clear schema — gave the assistant genuine capabilities; a name→function dispatcher turned tool requests into function calls; a robust loop ran every tool the model asked for in a round and reported errors back with is_error; a tool-aware system prompt told the model to use its tools rather than guess; and a REPL with /exit tied it all into a program you can talk to. The proof is a single compound question driving more than one tool in one round — the essence of an agent.

Key Concepts

  • Toolbox — a set of real functions plus their tool definitions, offered to the model together.
  • Dispatcher — a name → function map that routes a tool request to the code that runs it.
  • Robust tool-use loop — a loop that handles multiple tool_use blocks per round, returns all tool_results in one message, and passes is_error through on failure.
  • Tool-aware system prompt — instructions that tell the model what tools it has and to use them instead of guessing.
  • Tool chaining — answering one compound question by calling several tools, the foundation of agent behavior.

Why This Matters

Every agent you’ll build from here — a research assistant, a coding agent, a workflow bot — is this skeleton with bigger tools bolted on: real APIs, a database, a file system, a search index. The loop that runs many tools per round and recovers from errors is the same; only the toolbox grows. Having written it once, by hand, you now understand what those larger systems are doing underneath, and you can extend this one a tool at a time.


Next Steps

Continue to Module 4 - Model Context Protocol (MCP)

Move from hand-wiring tools to a standard protocol that lets your agent discover and use tools from any compatible server.


Continue Building Your Skills

You’ve built a real agent — a working assistant you wrote line by line, that picks the right tools, runs several at once, recovers from errors, and chains them into a single answer. That’s a genuine milestone: everything in this module now lives in one program you understand end to end. Next you’ll see how the Model Context Protocol turns the tool wiring you did by hand into a standard any server can speak — so your agents can plug into tools they’ve never seen before. Onward.