Lesson 5 - Guided Project: LangGraph Agent

Welcome to the LangGraph Agent Project

This is the capstone for the module, and it closes a loop you started two modules ago. In Module 8 you built a research assistant by hand: you wrote the think-act-observe loop yourself, dispatched tool calls, threaded conversation state through every turn, and grounded answers in a document store. It worked, and building it taught you exactly what an agent is. Now you’ll rebuild that same assistant on LangGraph — the same capabilities, the same multi-step reasoning, in a fraction of the code. You’ll wire a Chroma-backed retrieval tool, a calculator, a planning system prompt, and persistent memory into one agent, then watch it plan, retrieve, compute, and answer real questions about a product handbook. Download the dataset first: the Acme Cloud Notes product handbook, saved next to your script as product-handbook.md.

By the end of this project, you will be able to:

  • Build a knowledge base by splitting a document into chunks and indexing them in a Chroma vector store
  • Define agent tools — a retrieval tool over your vector store and a calculator — with the @tool decorator
  • Assemble a planning agent with create_agent, a system prompt, and a checkpointer for memory
  • Run multi-step questions and read the real tool trajectory the agent followed

This is the Module 8 research assistant, rebuilt on the framework. Let’s build it.


Build the Knowledge Base

An agent that grounds its answers needs something to search. We’ll load the handbook, split it into overlapping chunks, embed those chunks locally with a small sentence-transformer model, and store them in a Chroma vector store. The embeddings run on your machine for free — no API key — so the only paid calls in this whole project are to Claude.

import warnings
warnings.filterwarnings("ignore")  # silence library warnings so the lesson output stays readable
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_chroma import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Read the handbook and split it into overlapping chunks of up to 500 characters.
with open("product-handbook.md", encoding="utf-8") as f:
    text = f.read()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=500, chunk_overlap=100
).split_text(text)
print("number of chunks:", len(chunks))

# Embed the chunks locally and index them in a Chroma collection.
emb = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vs = Chroma.from_texts(chunks, embedding=emb, collection_name="handbook")
retriever = vs.as_retriever(search_kwargs={"k": 3})  # top 3 chunks per query
number of chunks: 14

The handbook splits into 14 chunks. split_text returns a plain list of strings, Chroma.from_texts embeds and indexes them in one call, and as_retriever(search_kwargs={"k": 3}) gives us an object that returns the 3 most relevant chunks for any query. This is the entire Module 7 RAG indexing step, compressed to a handful of lines — and now it becomes one tool the agent can reach for.
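
To make chunk_size and chunk_overlap concrete, here is a minimal sketch of overlapping chunking using a plain fixed-size window. It is a simplification, not how RecursiveCharacterTextSplitter works internally (that splitter prefers to break on separators such as paragraphs and sentences), and the sliding_chunks name is hypothetical.

```python
def sliding_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 100) -> list:
    """Cut text into fixed-size windows; neighbours share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
demo = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
print(demo)  # ['abcdefghij', 'ghijklmnop', 'mnopqrstuv', 'stuvwxyz']
```

Each chunk repeats the last four characters of the one before it. That overlap is what keeps a sentence that straddles a boundary retrievable from at least one chunk.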


Define the Tools

The agent needs two abilities: look things up in the handbook, and do arithmetic. Each is a normal Python function wrapped with the @tool decorator. The docstring is not decoration — Claude reads it to decide when to call the tool, so write it like an instruction.

from langchain_core.tools import tool

@tool
def search_docs(query: str) -> str:
    """Search the Acme Cloud Notes handbook for relevant passages."""
    hits = retriever.invoke(query)
    return "\n\n".join(d.page_content for d in hits)

@tool
def calculator(expression: str) -> str:
    """Evaluate a basic arithmetic expression."""
    return str(eval(expression, {"__builtins__": {}}, {}))

search_docs calls the retriever you built and joins the returned chunks into one string, which is what the agent gets back when it searches. calculator evaluates the expression with eval in a namespace stripped of builtins, so the model gets exact arithmetic instead of guessing. Stripping builtins narrows what eval can reach, but it is not a real sandbox, so treat this as a teaching shortcut rather than something to expose to untrusted input. A quick check that retrieval works before we hand it to the agent:

print(search_docs.invoke("how much does Pro cost")[:200])
Acme Cloud Notes offers three plans. The Starter plan is free forever and
includes up to 100 notes and 1 GB of storage. The Pro plan costs 8 dollars
per month and raises those limits to unlimited note

The right chunk comes back — the plans-and-billing section, which names the 8-dollar Pro price. The retrieval tool works. Now we give both tools to an agent.
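
If you want the calculator to accept arithmetic and nothing else, one common alternative to eval is to parse the expression and walk its syntax tree. The sketch below is an illustration, not the lesson's implementation: the safe_arithmetic name is hypothetical, and it deliberately supports only +, -, *, / and parentheses on plain numbers.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_arithmetic(expression: str):
    """Evaluate +, -, *, / on numeric literals; raise ValueError for anything else."""

    def evaluate(node):
        if isinstance(node, ast.Expression):
            return evaluate(node.body)
        # Accept int and float literals only (bool and str constants are rejected).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](evaluate(node.operand))
        raise ValueError(f"unsupported expression: {expression!r}")

    return evaluate(ast.parse(expression, mode="eval"))


print(safe_arithmetic("8 * 12"))         # 96
print(safe_arithmetic("(8 * 12) - 10"))  # 86
```

Names, attribute access, and function calls never reach an evaluator here, so an input such as __import__('os') raises ValueError. Malformed input raises SyntaxError from ast.parse. To use it in the agent, you could call it from the body of calculator and return str(...) of the result.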


Create the Agent

create_agent is the whole think-act-observe loop you wrote by hand, provided for you. You hand it a model, your list of tools, a system prompt that tells it how to behave, and a checkpointer that gives it memory. The system prompt here is a planning prompt: it tells the agent to search before it asserts, to compute with the calculator instead of doing mental math, and to admit when the docs don’t have the answer.

from langchain_anthropic import ChatAnthropic
from langchain.agents import create_agent
from langgraph.checkpoint.memory import InMemorySaver

model = ChatAnthropic(model="claude-haiku-4-5", max_tokens=400)

system_prompt = (
    "You are a research assistant for Acme Cloud Notes. "
    "Plan before you act: first use search_docs to ground every factual claim "
    "in the handbook, then use the calculator for any arithmetic. "
    "If the handbook does not contain the answer, say so plainly instead of guessing."
)

agent = create_agent(
    model,
    tools=[search_docs, calculator],
    checkpointer=InMemorySaver(),
    system_prompt=system_prompt,
)
cfg = {"configurable": {"thread_id": "session1"}}

The ANTHROPIC_API_KEY environment variable is read automatically by ChatAnthropic, so you never put the key in your code. The InMemorySaver checkpointer stores the conversation state in process memory, which means it lasts only as long as your script keeps running, and the thread_id in cfg names the conversation so follow-up turns can find it. Everything is wired. Time to ask it something that needs more than one step.
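
As a mental model for what the thread_id buys you, the toy class below keeps one message list per thread. It is only an illustration of the idea: ToyThreadMemory is a hypothetical name, and a real checkpointer saves full graph state rather than a bare list of strings.

```python
class ToyThreadMemory:
    """Toy stand-in for a checkpointer: one message history per thread_id."""

    def __init__(self):
        self._threads = {}

    def append(self, thread_id: str, message: str) -> None:
        # Create the thread's history on first use, then add the message.
        self._threads.setdefault(thread_id, []).append(message)

    def history(self, thread_id: str) -> list:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._threads.get(thread_id, []))


memory = ToyThreadMemory()
memory.append("session1", "How much is Pro for a year?")
memory.append("session1", "And what does that plan add?")
memory.append("session2", "Unrelated question")

print(memory.history("session1"))  # both session1 messages, in order
print(memory.history("session2"))  # ['Unrelated question']
print(memory.history("session3"))  # []
```

Reusing "session1" continues the same conversation, while a new thread_id starts from an empty history. That is the behavior to expect when you change the thread_id in cfg.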


Run a Question That Needs Both Tools

Here is the question that exercises the whole agent at once: it requires retrieval (find the Pro price in the docs) and then computation (multiply by twelve). The agent has to plan the order itself.

def trajectory(result):
    for m in result["messages"]:
        t = type(m).__name__
        if t == "HumanMessage":
            print(f"[Human] {m.content}")
        elif t == "AIMessage":
            for tc in m.tool_calls:
                print(f"[AI -> tool] {tc['name']}({tc['args']})")
            if isinstance(m.content, str) and m.content:
                print(f"[AI] {m.content}")
        elif t == "ToolMessage":
            c = m.content if len(m.content) < 70 else m.content[:70] + "..."
            print(f"[Tool result] {c}")

r1 = agent.invoke({"messages": [("human",
    "If I'm on the Pro plan, how much do I pay for a full year? "
    "Ignore the annual discount.")]}, cfg)
trajectory(r1)
[Human] If I'm on the Pro plan, how much do I pay for a full year? Ignore the annual discount.
[AI -> tool] search_docs({'query': 'Pro plan pricing monthly cost'})
[Tool result] Acme Cloud Notes offers three plans. The Starter plan is free fo...
[AI -> tool] calculator({'expression': '8 * 12'})
[Tool result] 96
[AI] If you're on the Pro plan at $8 per month, you would pay $96 for a full year (ignoring the annual discount).

Read the trajectory top to bottom — this is the agent’s reasoning made visible. It searched the docs first, found the 8-dollar Pro price, then called the calculator with 8 * 12, got 96 back, and only then composed the final answer. Nobody told it that order; the planning prompt set the policy and the model chose the steps. That is exactly the loop you hand-coded in Module 8, now driven by the framework.
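
For comparison, here is a stripped-down sketch of that loop with a scripted stand-in for the model, so it runs without an API key. Everything in it (run_loop, scripted_model, toy_tools, the action dictionaries) is hypothetical and far simpler than what create_agent builds. It only shows the think-act-observe shape.

```python
def run_loop(model_step, tools, question, max_steps=5):
    """Minimal think-act-observe loop over a list of (role, content) pairs."""
    messages = [("human", question)]
    for _ in range(max_steps):
        action = model_step(messages)                    # think
        if action["type"] == "final":
            messages.append(("ai", action["content"]))
            return messages
        result = tools[action["tool"]](action["input"])  # act
        messages.append(("tool_call", f"{action['tool']}({action['input']!r})"))
        messages.append(("tool_result", str(result)))    # observe
    raise RuntimeError("no final answer within max_steps")


def scripted_model(messages):
    """Stand-in for the LLM: search first, then calculate, then answer."""
    results = [content for role, content in messages if role == "tool_result"]
    if len(results) == 0:
        return {"type": "tool", "tool": "search_docs", "input": "Pro plan price"}
    if len(results) == 1:
        return {"type": "tool", "tool": "calculator", "input": "8 * 12"}
    return {"type": "final", "content": f"You would pay ${results[-1]} for a full year."}


def multiply(expression):
    """Toy calculator that only understands 'a * b'."""
    left, right = expression.split("*")
    return int(left) * int(right)


toy_tools = {
    "search_docs": lambda query: "The Pro plan costs 8 dollars per month.",
    "calculator": multiply,
}

transcript = run_loop(scripted_model, toy_tools, "How much is Pro for a full year?")
for role, content in transcript:
    print(f"[{role}] {content}")
```

The transcript has the same shape as the trajectory above: a search, a calculation, then the answer. With the real agent, the model decides the steps and the framework handles dispatch, the step limit, and state.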


Run a Memory Follow-Up

Because the agent has a checkpointer and we reuse the same thread_id, the next question can refer back to “that plan” with no extra context. The prior turn is already in memory.

r2 = agent.invoke({"messages": [("human",
    "And what does that plan add compared to the free Starter plan?")]}, cfg)
print(r2["messages"][-1].content)
Based on the information I already found in the handbook, the Pro plan adds
the following compared to the free Starter plan:

1. Unlimited notes (vs. 100 notes on Starter)
2. 50 GB of storage (vs. 1 GB on Starter)
3. Version history (not included on Starter)

Notice what the agent did not do: it didn’t search again. The Pro-versus-Starter details were already in the chunk it retrieved on the previous turn, and the checkpointer kept that context alive, so the model answered directly from memory. “That plan” resolved to the Pro plan because the whole conversation — every message and tool result — persists across calls under the same thread.
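
Because r2["messages"] holds the whole thread, including the tool calls from the first turn, a quick way to confirm that the latest turn made no tool calls is to look only at the messages after the last human message. The helper below is a sketch under that assumption. The tool_calls_in_last_turn name is hypothetical, and the demo uses stand-in objects that mimic the type and tool_calls attributes of LangChain messages.

```python
from types import SimpleNamespace


def tool_calls_in_last_turn(messages):
    """Return the names of tools called after the most recent human message."""
    last_human = max(
        (i for i, m in enumerate(messages) if getattr(m, "type", None) == "human"),
        default=-1,
    )
    names = []
    for m in messages[last_human + 1:]:
        # Only AI messages carry tool_calls; other messages contribute nothing.
        for call in getattr(m, "tool_calls", None) or []:
            names.append(call["name"])
    return names


# Stand-in thread shaped like the two turns above.
thread = [
    SimpleNamespace(type="human", content="Pro price for a full year?"),
    SimpleNamespace(type="ai", content="", tool_calls=[{"name": "search_docs", "args": {}}]),
    SimpleNamespace(type="tool", content="The Pro plan costs 8 dollars per month."),
    SimpleNamespace(type="ai", content="", tool_calls=[{"name": "calculator", "args": {}}]),
    SimpleNamespace(type="tool", content="96"),
    SimpleNamespace(type="ai", content="You would pay $96.", tool_calls=[]),
    SimpleNamespace(type="human", content="What does that plan add?"),
    SimpleNamespace(type="ai", content="Unlimited notes, 50 GB, version history.", tool_calls=[]),
]

print(tool_calls_in_last_turn(thread[:6]))  # ['search_docs', 'calculator']
print(tool_calls_in_last_turn(thread))      # []
```

Calling tool_calls_in_last_turn(r2["messages"]) on a real run should return an empty list whenever the agent answered from context, and a non-empty list if it decided to search again.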


Run a Question the Docs Can’t Answer

A grounded assistant has to know its limits. We ask about a feature the handbook never mentions, and the planning prompt’s last instruction kicks in.

r3 = agent.invoke({"messages": [("human",
    "Does Acme Cloud Notes have a mobile widget for the home screen?")]}, cfg)
trajectory(r3)
[Human] Does Acme Cloud Notes have a mobile widget for the home screen?
[AI -> tool] search_docs({'query': 'mobile widget home screen'})
[Tool result] from Account Settings, which asks for a one-time code from your p...
[AI -> tool] search_docs({'query': 'widget features mobile app'})
[Tool result] ## Support  Starter and Pro users can reach support by email at h...
[AI] The handbook doesn't contain information about mobile widgets for the home screen. I'm unable to confirm whether Acme Cloud Notes offers this feature based on the available documentation.

The agent tried twice with two different search queries, got back chunks about two-factor codes and support, recognized that neither was relevant, and declined rather than inventing a feature. That refusal is the difference between a research assistant you can trust and a confident hallucination machine. Note that retrieval did not come back empty: the retriever always returns its top 3 chunks, relevant or not. The grounding policy held because the model judged those chunks irrelevant.

Look at how little code this took

Your Module 8 research assistant needed dozens of lines just for the loop: a while loop, manual tool dispatch, a growing messages list you appended to by hand, and your own memory bookkeeping. Here the entire agent — loop, dispatch, and memory — is one create_agent call plus a thread_id. The work you do is now the part that’s actually unique to your application: the tools, the prompt, and the knowledge base. This is production-shaped code. In Module 10 you’ll take an agent like this one and ship it as a real, reliable application.


Extend the Project

Exercise 1: Add a third tool

Give the agent a today() tool that returns the current date, then ask it a question that combines it with the docs — for example, “If I delete my account today, by what date will my data be erased?” (the handbook says within 30 days).

Hint

Import date from datetime, write a @tool function that returns str(date.today()), and add it to the tools=[...] list. Date arithmetic is awkward for the calculator, so either let the model reason about "30 days from today" from the date string, or add a small date-math helper tool.
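
One possible shape for that date-math helper is sketched below. The add_days name and its signature are assumptions for illustration. It takes an ISO date string, which is the format str(date.today()) produces, and it is written as a plain function so you can test it before wrapping it with @tool.

```python
from datetime import date, timedelta


def add_days(start: str, days: int) -> str:
    """Return the ISO date that falls `days` days after `start` (YYYY-MM-DD)."""
    return (date.fromisoformat(start) + timedelta(days=days)).isoformat()


print(add_days("2025-01-15", 30))  # 2025-02-14
```

Letting the tool do the calendar arithmetic avoids asking the model to count days across month boundaries in its head.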

Exercise 2: Stream the agent

Instead of agent.invoke(...), call agent.stream(...) and print each step as it arrives, so you watch the trajectory unfold live instead of seeing it only after the agent finishes.

Hint

Iterating with for step in agent.stream({"messages": [...]}, cfg, stream_mode="values"): gives you the full state after each step. Print step["messages"][-1] each time to see the newest message (the tool call, the tool result, then the final answer) appear one at a time.
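
The hint can be wrapped in a small generator, sketched here. The newest_messages name is hypothetical, and FakeAgent is only a stand-in that mimics the stream call from the hint so the example runs without an API key. With the real agent you would pass agent instead.

```python
def newest_messages(agent, question, cfg):
    """Yield the newest message from each streamed state snapshot."""
    inputs = {"messages": [("human", question)]}
    for step in agent.stream(inputs, cfg, stream_mode="values"):
        yield step["messages"][-1]


class FakeAgent:
    """Stand-in that emits growing message lists, like stream_mode="values"."""

    def stream(self, inputs, config, stream_mode="values"):
        messages = list(inputs["messages"])
        yield {"messages": list(messages)}  # first snapshot: just the input
        for extra in ["tool call", "tool result", "final answer"]:
            messages.append(extra)
            yield {"messages": list(messages)}


demo_cfg = {"configurable": {"thread_id": "demo"}}
seen = list(newest_messages(FakeAgent(), "hi", demo_cfg))
for message in seen:
    print(message)
```

The first item printed is the human message itself, since the initial snapshot contains only the input.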

Exercise 3: Add a citation step

Change search_docs so the agent can cite its sources. Instead of joining only the chunk text, prefix each chunk with a short label, and add a line to the system prompt asking the agent to mention which sections it used.

Hint

In search_docs, return something like "[Source 1] " + d.page_content for each hit. Then add “Cite which sources you used by their labels” to the system prompt. The model will carry those labels into its answer because they arrive inside the tool result it reads.
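
The labeling step on its own might look like the sketch below. The label_hits name is hypothetical, and the demo uses stand-in objects that carry a page_content attribute like the documents the retriever returns.

```python
from types import SimpleNamespace


def label_hits(hits):
    """Join retrieved chunks into one string, prefixing each with a numbered label."""
    return "\n\n".join(
        f"[Source {i}] {d.page_content}" for i, d in enumerate(hits, start=1)
    )


fake_hits = [
    SimpleNamespace(page_content="The Pro plan costs 8 dollars per month."),
    SimpleNamespace(page_content="Starter and Pro users can reach support by email."),
]
labeled = label_hits(fake_hits)
print(labeled)
```

Inside search_docs you would return label_hits(hits) in place of the plain join. The labels number the hits within a single search, so "Source 1" can point to different chunks on different searches.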


Summary

You rebuilt the Module 8 research assistant on LangGraph and watched it solve real multi-step problems. You started by building a knowledge base — splitting the handbook into 14 chunks and indexing them in a Chroma vector store with local embeddings — then exposed it as a search_docs tool alongside a calculator. You assembled the agent with a single create_agent call, gave it a planning system prompt and a checkpointer for memory, and ran three questions that each tested something different: one needing both retrieval and computation (it searched, then multiplied 8 by 12 to get 96), a memory follow-up it answered from prior context without searching again, and one outside the docs that it correctly declined to answer. Reading the trajectories, you saw the agent plan its own tool order — the exact loop you once wrote by hand, now provided by the framework.

Key Concepts

  • Knowledge base — document chunks embedded and stored in a vector store, queried through a retriever.
  • Retrieval tool — a @tool that calls the retriever and returns matching passages, giving the agent grounded search.
  • Planning system prompt — instructions that set the agent’s policy: search first, compute with the calculator, decline when the docs are silent.
  • Checkpointer + thread_id — persistent memory that carries the whole conversation across turns, so follow-ups resolve references like “that plan.”
  • Trajectory — the ordered sequence of tool calls and results that reveals how the agent reasoned to its answer.

Why This Matters

This is the shape of a real production agent: tools over your own data, a calculator for exactness, a prompt that enforces grounding, and memory across a session — all in code short enough to read in one sitting. Because you built every one of these pieces by hand first, you can read the trajectory and know precisely what the framework is doing at each step, debug it when it misbehaves, and decide what to add. That combination — understanding the internals and wielding the framework — is what lets you ship LLM systems that you can actually trust and maintain.


Next Steps

Continue to Module 10 - Shipping AI Applications

Take agents like this one and deploy them as real, reliable applications — APIs, evaluation, monitoring, and the engineering that turns a working prototype into production software.



Continue Building Your Skills

You’ve built a complete, grounded, multi-step agent on a production framework and seen it plan, retrieve, compute, remember, and decline. Next you’ll turn systems like this into shipped software — deployed, evaluated, and reliable enough to put in front of real users.