Lesson 4 - Tool Design Patterns

Welcome to Tool Design Patterns

Lesson 1 made a single tool good: a clear name, a description that says when to use it, a tight schema, and a clean return value. But a real agent has several tools, and the way they fit together matters as much as any one of them. A tool set that looks fine in isolation can still confuse Claude — two tools that overlap, an action that gets fired twice because the model retried it, a result so large the model loses the thread. This lesson is about the design judgment that turns a pile of decent tools into a tool set Claude can use confidently and your code can run safely.

By the end of this lesson, you will be able to:

  • Right-size tool granularity — a few well-scoped tools instead of one giant tool or many overlapping ones
  • Reason about which tools are safe to run in parallel or retry
  • Gate side-effecting and irreversible actions behind a guardrail
  • Shape concise, consistent return values, including for large result sets

This builds directly on the loop from Module 2 and the principle that the model proposes, your code disposes from Module 1. Let’s begin.


Granularity: Right-Sizing Your Tools

The first judgment call is how to carve up the work. There’s a sweet spot between two failure modes.

On one end is the do-everything tool. Imagine a single manage_trip tool that searches destinations, fetches weather, and builds an itinerary depending on a mode argument. It looks tidy — one tool! — but it pushes all the branching logic onto Claude through a confusing schema, and the description can’t honestly say “use this when…” because it does three unrelated things.

# Too coarse — one tool wearing three hats
manage_trip(mode="search" | "weather" | "itinerary", ...)

On the other end is tool sprawl: a dozen near-identical tools where Claude can’t tell which to pick. find_city, search_places, lookup_destination, get_location — these overlap, and overlapping tools are poison for tool selection. When two descriptions could both plausibly apply, the model has to guess, and guesses go wrong.

The sweet spot is a few tools that each do one clear thing, with no overlap:

# Right-sized — each tool has one job and a clean "use this when"
search_destinations(query)      # find places matching an interest
get_weather(city)               # current conditions for one city
build_itinerary(city, days)     # day-by-day plan for a chosen city

These are Atlas’s three core tools. Each has a verb-noun name, a non-overlapping purpose, and a description that can clearly state when it applies. A good test: if you can’t write a one-sentence “use this when…” clause that doesn’t collide with another tool’s clause, your granularity is off — either you’re splitting one job across tools or cramming several jobs into one.


Parallel Tools and Side Effects

Once you have a clean set, the next question is how the loop runs them — and that depends on what each tool does to the world.

Tools Claude can run in parallel

Claude can request several tool calls in a single turn when they’re independent. Ask Atlas for the weather in Kyoto and Lisbon, and the model may return two get_weather calls at once. Your loop (from Module 2) executes both and returns both results in a single user message:

# Claude's turn may contain multiple independent tool_use blocks
[ get_weather(city="Kyoto"), get_weather(city="Lisbon") ]
# Your loop runs them, then returns both tool_result blocks in one user message

This is faster and entirely safe — as long as the tools are independent. get_weather for two cities doesn’t depend on order or shared state, so running them together is fine. Design your read-only tools to be safely parallelizable: no shared mutable state, no ordering assumptions.

When a step must happen before another — or when concurrent calls would collide — you can set disable_parallel_tool_use so Claude issues at most one tool call per turn and waits for each result. Reach for it when later calls genuinely depend on earlier ones, not as a blanket default; turning off parallelism everywhere just makes the agent slower.

Side effects and idempotency

What makes a tool safe to parallelize is the same property that makes it safe to retry: it doesn’t change the world. This is the dividing line that should shape your whole tool set.

  • Read-only toolssearch_destinations, get_weather, build_itinerary — only fetch or compute. Running them twice, or in parallel, costs nothing but a little time. Retry freely.
  • Side-effecting tools — a hypothetical book_trip, or send_email, charge_card, delete_account — change something. Run one twice and you’ve double-booked, double-charged, or sent two emails. These are not safe to retry blindly, and usually not safe to parallelize.

Two defenses help here. First, make side-effecting tools idempotent where you can: accept a client-supplied request_id so a repeat call with the same id is recognized and ignored rather than executed again. Second — and this is the one you can never skip — gate irreversible actions behind a confirmation or dry-run, so the model can propose the booking but your code decides whether to actually fire it.

Never let the agent fire an irreversible action unguarded

Tools that send, book, pay, or delete change the real world, and some of those changes can’t be taken back. Never wire one straight to the model’s output. Put a guardrail in front: require an explicit confirmation step, run a dry-run that returns “here’s what would happen,” or have a human approve. Make the action idempotent so an accidental retry is harmless. This is the model proposes, your code disposes in its highest-stakes form — Claude can suggest booking the trip, but your code, not the model, decides whether the charge actually goes through.


Return Shapes: Concise, Structured, Consistent

Lesson 1 covered making one return value clean. With a whole tool set, two extra concerns appear: handling large results, and staying consistent across tools.

Summarize big result sets. A tool that can return 500 destinations should not dump 500 destinations into the conversation. That floods the context, buries the useful part, and costs tokens on every subsequent turn. Return a summary or the top N, plus a way to get more:

# Before — a firehose
search_destinations("beaches") -> [ {500 full records...} ]

# After — top results plus a handle for the rest
{
  "total": 512,
  "showing": 5,
  "results": [ {top 5 concise records} ],
  "more": "call search_destinations again with page=2 for more"
}

The model gets what it needs to act now and a clear path to drill in if it wants — instead of choking on the whole set at once.

Keep return shapes consistent across tools. If every Atlas tool reports failure the same way and labels its fields the same way, Claude learns the pattern once and applies it everywhere. Drifting shapes — one tool returns a bare string, another a dict, a third raises on “not found” — force the model to handle each tool as a special case, and special cases are where mistakes hide.

# Consistent across the tool set: same keys, same "empty" convention
{ "ok": True,  "data": {...} }
{ "ok": False, "error": "no destinations matched 'volcano spa'" }

Concise, structured, and consistent is the goal: each result is small and labeled, and all of them speak the same dialect.


Practice Exercises

Exercise 1: Split an overloaded tool

A team built one travel_helper(action, payload) tool where action can be "search", "weather", "itinerary", or "book". Claude keeps sending malformed payloads and occasionally picks the wrong action. How would you redesign this?

Hint

Split it into separate, single-purpose tools: search_destinations, get_weather, build_itinerary, and book_trip. Each gets its own tight schema (no more catch-all payload) and a description with a clear “use this when…” clause. As a bonus, you’ve now isolated book_trip — the one side-effecting action — so you can gate just that tool behind confirmation instead of guarding a do-everything tool.

Exercise 2: Which tools are safe to run in parallel?

For these four tools — get_weather, search_destinations, build_itinerary, book_trip — decide which Claude can safely fire together in one turn and which must run alone, and explain why.

Hint

The first three are read-only: they fetch or compute and change nothing, so they’re safe to run in parallel and safe to retry. book_trip has a side effect (and likely an irreversible one), so it should not be run in parallel with anything and should be gated — this is a case for disable_parallel_tool_use and a confirmation step, plus an idempotency key so a retry doesn’t double-book.

Exercise 3: Redesign a firehose return value

search_destinations returns a list of 500 full records — every field for every match. The agent’s answers have gotten slow and vague. What’s going wrong, and what should the tool return instead?

Hint

Dumping 500 records floods the context window, buries the relevant matches, and re-costs those tokens on every following turn — so the model loses the thread and answers vaguely. Return a concise top-N (say the 5 best matches with just the fields that matter), plus the total count and a way to fetch more (a page argument or a more hint). The model gets enough to act now and a clear path to drill in.


Summary

A good tool set is more than a collection of good tools. Granularity is the first call: a few well-scoped tools that each do one clear thing beat both a do-everything tool and a sprawl of overlapping ones — if you can’t write a non-colliding “use this when…” clause for each, your granularity is off. Parallelism and side effects go together: Claude can run independent tools in one turn, and the same property that makes a tool safe to parallelize — not changing the world — makes it safe to retry. Read-only tools (search_destinations, get_weather, build_itinerary) are safe; side-effecting ones (book_trip, send_email, charge_card) must be made idempotent and gated behind confirmation, with disable_parallel_tool_use when a step must be sequential. Finally, return shapes should be concise, structured, and consistent: summarize big result sets to a top-N plus a way to get more, and use the same shape across every tool so the model learns the pattern once.

Key Concepts

  • Granularity — a few single-purpose, non-overlapping tools; avoid both the do-everything tool and tool sprawl.
  • Parallel tools — Claude can call independent tools in one turn; design read-only tools to run safely together; use disable_parallel_tool_use for must-be-sequential steps.
  • Side effects and idempotency — read-only tools are safe to retry and parallelize; world-changing tools must be idempotent and gated behind a confirmation or dry-run.
  • Return shapes — concise, structured, and consistent across tools; summarize large result sets to top-N plus a path to get more.

Why This Matters

These patterns prevent the failures that don’t show up until an agent has more than one tool and starts taking real actions: the wrong tool picked because two overlapped, a payment fired twice because a call was retried, an answer that drifts because a result buried the signal under 500 rows. Right-sizing, idempotency, gating, and consistent returns are cheap to design in up front and expensive to retrofit after something irreversible has gone wrong. In the next lesson you’ll put all of it together, hardening Atlas’s tools into a set that’s robust under real use.


Next Steps

Continue to Lesson 5 - Guided Project: Robust Atlas Tools

Put the design patterns to work — harden Atlas's tools into a set that's safe, consistent, and robust under real use.

Back to Module Overview

Return to the Designing Tools module overview


Continue Building Your Skills

You now have the design judgment that separates a dependable tool set from a confusing one: right-sized granularity with no overlap, an honest line between read-only tools and ones with side effects, idempotency and gating for irreversible actions, and concise, consistent return shapes. Next you’ll apply every lever from this module in a guided project — taking Atlas’s search_destinations, get_weather, and build_itinerary from “works in a demo” to “robust under real use.”