Lesson 1 - Why Agents Need to Plan

Welcome to Why Agents Need to Plan

Your agent works. It loops, calls tools, handles errors. So why does it sometimes flail on a harder request — calling tools in a confusing order, skipping a step, or producing an answer that’s confidently wrong? Because being able to act is not the same as knowing what to do next. A bare loop reacts one step at a time with no overall strategy. On simple tasks that’s fine. On multi-step tasks — “plan me a trip that fits my budget, the season, and my interests” — the agent needs reasoning structure: a way to break the goal down, think before each action, and check its own work. This module is about adding that structure, and this lesson sets up why it matters.

By the end of this lesson, you will be able to:

  • Explain why a bare tool loop struggles with multi-step tasks
  • Describe the reason-act-observe cycle and why explicit reasoning helps
  • Recognize the three patterns this module teaches and when each applies
  • See that these patterns are structure around the loop you already built

Let’s begin with what goes wrong without planning.


When a Bare Loop Isn’t Enough

Recall how the loop works: each turn, the model looks at everything so far and picks one next action. For a one- or two-step task, that’s plenty — “what’s the weather in Kyoto?” needs a single tool call and an answer. The model doesn’t need a strategy; the task is the strategy.

Hard tasks are different. “Plan a relaxed, budget-friendly 3-day autumn trip somewhere in Japan” has no single obvious first move. It depends on sub-decisions — where, when, how much, doing what — and those depend on each other. A bare loop, picking one action at a time with no plan, can lose the thread: it might draft an itinerary before checking the budget, or forget a constraint the user gave three turns ago. Nothing crashes — it just produces worse results. What’s missing isn’t capability; it’s deliberation.

The patterns in this module add that deliberation in three complementary ways: decompose the goal into steps, reason explicitly before each action, and reflect on the result before finishing.


Reason, Act, Observe

The core move is simple: get the model to think before it acts, and to use what it observes to think again. That cycle — reason, act, observe, repeat — is the backbone of everything that follows:

Reasoning and acting interleaved (ReAct). A Thought box ('To plan an outdoor day, I should check the weather') leads to an Action box (get_weather('Kyoto')), which leads to an Observation box ('16°C, clear'); a dashed arrow feeds the observation back so the result informs the next thought, and the cycle repeats. When the agent has enough, it exits to a Final answer reasoned from what it found. A caption notes the agent thinks, acts, sees the result, and thinks again — reasoning makes each next action a deliberate choice, not a guess.
The reason-act-observe cycle: the agent reasons about what to do, takes an action, observes the result, and reasons again — so each action is a deliberate choice informed by what it has learned.

Here’s the key insight: the agent loop you already built supports this for free. Remember that an assistant response can contain both a text block and a tool-use block. That text block is the agent’s reasoning — its “thought” — and the tool-use block is its action. When you encourage the model to explain its thinking before it calls a tool, you get reasoning interleaved with acting, with no change to the loop’s machinery. You’re not building a new engine; you’re getting more out of the one you have.


Three Patterns, One Loop

This module teaches three patterns, and it’s worth seeing up front how they relate — because they’re not rivals, they’re layers you can combine:

  • Decomposition (Lesson 2) — break a big goal into an ordered list of smaller steps, then carry them out. Planning before acting.
  • ReAct (Lesson 3) — interleave reasoning and acting during the loop: think, act, observe, think again. Planning while acting.
  • Reflection (Lesson 4) — after producing a result, critique it and revise. Planning after acting.

Notice the through-line: all three are about adding a thinking step the bare loop skips — before, during, or after the action. And all three are structure you wrap around the agent loop from Module 2, not replacements for it. A sophisticated agent often uses all three: decompose the task, reason through each step, and reflect on the result before finishing.

Reasoning is mostly a prompting move

A lot of “planning” comes down to asking the model to think. Telling the agent, in its system prompt, to “first outline a short plan, then carry it out” or “explain your reasoning before each tool call” often improves multi-step behavior dramatically — no new framework required. The patterns in this module give that instinct structure, but the underlying lever is simple: prompt for deliberation, and give the reasoning somewhere to go (a plan, a thought, a critique).


Practice Exercises

Exercise 1: Spot the multi-step task

Which of these likely needs planning/reasoning structure, and which is fine with a bare single step? (a) “What’s 50 USD in euros?” (b) “Find me a 5-day itinerary that stays under my budget, avoids cold weather, and includes good vegetarian food.”

Hint

(a) is a single step — one currency conversion and done; a bare loop handles it. (b) is genuinely multi-step with interacting constraints (budget, weather, food, dates), so it benefits from decomposition, explicit reasoning, and a reflection check — the patterns in this module.

Exercise 2: Where does the “thought” live?

In the agent loop, an assistant response can contain a text block and a tool-use block together. Which is the agent’s reasoning, and why does that mean ReAct needs no new machinery?

Hint

The text block is the reasoning (“thought”); the tool-use block is the action. Because the loop already returns both in one response and already feeds tool results back, you get reason-act-observe simply by prompting the model to think before it acts — no new loop is needed.

Exercise 3: Match the pattern

Match each to decomposition, ReAct, or reflection: (a) “before answering, list the steps you’ll take”; (b) “after drafting the plan, check it against the constraints and fix any misses”; (c) “explain why you’re calling each tool as you go.”

Hint

(a) decomposition — planning the steps before acting; (b) reflection — critiquing and revising after producing a result; (c) ReAct — reasoning interleaved with each action during the loop.


Summary

A bare tool loop acts one step at a time with no overall strategy, which is fine for simple tasks but causes it to wander on multi-step ones — wrong order, missed steps, confidently flawed answers. The fix is reasoning structure, not a bigger model. The backbone is the reason-act-observe cycle: think, act, observe, think again — and crucially, the loop you already built supports it, because an assistant response carries both a reasoning (text) block and an action (tool-use) block. This module’s three patterns each add a thinking step the bare loop skips: decomposition (plan before acting), ReAct (reason while acting), and reflection (critique after acting). They layer on top of your existing loop and combine freely.

Key Concepts

  • Why plan — multi-step tasks need deliberation, not just the ability to act.
  • Reason-act-observe — think, act, observe, repeat; the loop already supports it.
  • Three patterns — decomposition (before), ReAct (during), reflection (after).
  • Structure around the loop — these wrap your existing agent loop; they don’t replace it.

Why This Matters

The gap between a flashy agent demo and one you’d actually rely on is mostly reasoning structure. Knowing why and when to add it — and that it layers onto the loop you already have — is what lets you take an agent from “works on easy questions” to “handles genuinely hard, multi-step tasks.” It also keeps you from over-engineering: many problems just need a short “think first” prompt, not a framework. Next, you’ll build the first pattern for real — decomposing a hard goal into steps the agent can actually execute.


Next Steps

Continue to Lesson 2 - Task Decomposition

Break a big goal into an ordered list of steps, then execute them one at a time.

Back to Module Overview

Return to the Planning and Reasoning module overview


Continue Building Your Skills

You now know why agents need reasoning structure, the reason-act-observe cycle at its heart, and the three patterns this module layers onto your loop. Next you’ll build the first one — task decomposition — turning a hard goal into an ordered sequence the agent can actually carry out.