Lesson 5 - Guided Project: Retrieval-Augmented Atlas

Welcome to the Guided Project

Across this module you took retrieval apart piece by piece. A model’s knowledge is frozen at training time and blind to your private documents, so you built a KnowledgeBase that chunks and embeds your documents (Lesson 2), gave the agent a search_knowledge tool to look facts up mid-loop (Lesson 3), and added the discipline that turns “the model read some text” into “the agent gave a grounded answer”: answer only from what was retrieved, cite the source, and refuse when there’s nothing (Lesson 4). Now you’ll put it all back together on one agent. Atlas, the trip-planning assistant you’ve been building all course, is going to plan from your travel guides: it searches a destination knowledge base, answers from the passages it retrieves, cites where each fact came from, and says “I don’t have that” instead of inventing, all while still recalling what it remembers about the traveler from Module 4.

By the end of this project, you will be able to:

  • Build Atlas a destination knowledge base from travel guides and confirm what it indexed
  • Give Atlas a search_knowledge retrieval tool it calls mid-plan to look up destinations
  • Ground Atlas’s answers in retrieved passages with inline citations, and gate out questions the knowledge base can’t cover
  • Combine memory and retrieval so Atlas plans from both what it remembers about the traveler and the facts in the knowledge base

We’ll build it in stages, reusing the exact pieces you wrote earlier in the module. Let’s give Atlas a grounded brain.


Stage 1: Build Atlas’s Destination Knowledge Base

Start with the KnowledgeBase you built in Lesson 2. It does three things: chunk splits a document into passages (one per paragraph, further split if long), embed turns each passage into a vector with the dependency-free keyword embedding carried over from Module 4, and search ranks passages against a query by cosine similarity. This is the store Atlas retrieves from.

import hashlib, re
import numpy as np

STOP = {"the","a","an","and","or","is","are","to","of","in","on","for","with",
        "you","your","it","its","as","at","by","be","this","that","from","has",
        "have","best","good","great"}

def embed(text, dim=1024):
    vec = np.zeros(dim)
    for tok in re.findall(r"[a-z]+", text.lower()):
        if tok in STOP:
            continue
        vec[int(hashlib.md5(tok.encode()).hexdigest(), 16) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def chunk(text, max_words=40):
    chunks = []
    for para in [p.strip() for p in text.split("\n\n") if p.strip()]:
        words = para.split()
        if len(words) <= max_words:
            chunks.append(para)
        else:
            for i in range(0, len(words), max_words):
                chunks.append(" ".join(words[i:i + max_words]))
    return chunks

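class KnowledgeBase:
    """In-memory passage store with cosine-similarity search."""
    def __init__(self):
        self.passages = []  # list of (source, text, vec)

    def add_document(self, source, text):
        # Chunk the document and store one embedded passage per chunk.
        for c in chunk(text):
            self.passages.append((source, c, embed(c)))

    def search(self, query, k=3):
        q = embed(query)
        # embed() returns unit vectors (or all zeros), so the dot product is the cosine similarity.
        scored = [(src, txt, float(q @ v)) for src, txt, v in self.passages]
        scored.sort(key=lambda r: -r[2])
        # Keep the top k, dropping passages that share no keywords with the query.
        return [(s, t, round(sc, 3)) for s, t, sc in scored[:k] if sc > 0]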

Now feed it Atlas’s travel guides. Each add_document takes a source label — the citation tag you’ll see later — and the guide text; chunk splits it into passages on the way in.

KYOTO = """Kyoto in autumn is famous for its fall foliage, which peaks in mid to late November.
Temperatures are mild, typically 10 to 18 degrees Celsius, ideal for walking temple grounds.

Arashiyama and the bamboo grove are a top autumn destination, best visited early morning to avoid crowds.

Many temples such as Tofuku-ji open special evening illuminations during the foliage season."""

SAPPORO = """Sapporo in winter is known for heavy snowfall and the February Snow Festival.
Temperatures often fall below freezing, so warm layers are essential."""

kb = KnowledgeBase()
kb.add_document("kyoto-guide", KYOTO)
kb.add_document("sapporo-guide", SAPPORO)
print(f"indexed {len(kb.passages)} passages")

Running it:

indexed 4 passages

Three Kyoto paragraphs and one Sapporo paragraph become 4 indexed passages, each stored as (source, text, vec). Atlas now has a destination knowledge base — small, but the mechanics are identical at a thousand documents. Next, let Atlas reach into it.
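
None of the sample guides is long enough to reach the second branch of chunk, so here is a quick sketch that exercises it. The chunk function is repeated from above unchanged (apart from a docstring) so the block runs on its own; the 90-word paragraph is made-up filler, not a real guide.

```python
def chunk(text, max_words=40):
    """Split text into passages: one per paragraph, windowed if too long."""
    chunks = []
    for para in [p.strip() for p in text.split("\n\n") if p.strip()]:
        words = para.split()
        if len(words) <= max_words:
            chunks.append(para)
        else:
            for i in range(0, len(words), max_words):
                chunks.append(" ".join(words[i:i + max_words]))
    return chunks

# Hypothetical input: one short paragraph plus one 90-word filler paragraph.
short_para = "A short paragraph stays whole."
long_para = " ".join(f"word{i}" for i in range(90))
passages = chunk(short_para + "\n\n" + long_para)

sizes = [len(p.split()) for p in passages]
print(sizes)  # [5, 40, 40, 10]
```

The windows cut on word count alone, so a long paragraph can be split mid-sentence. That is the trade-off of this simple chunker.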


Stage 2: Give Atlas a search_knowledge Retrieval Tool

A knowledge base Atlas can’t reach is useless. The search_knowledge tool from Lesson 3 is how the agent triggers a lookup mid-plan: it’s an ordinary function wrapping kb.search, plus the JSON schema that tells the model the tool exists. When Atlas hits a destination question, it calls this tool, reads the retrieved passages, and reasons on.

[Figure: agentic retrieval as a tool. A user asks “When should I visit Arashiyama?”; the model calls search_knowledge with the query “Arashiyama bamboo grove best time”; the tool returns the passage “[kyoto-guide] ...best visited early morning to avoid crowds”; the model reads that tool result and produces a grounded, cited final answer.]
Caption: Atlas decides mid-loop to call search_knowledge, the tool returns real passages from the knowledge base, and Atlas reasons on with those facts in hand (agentic RAG).

def search_knowledge(query):
    hits = kb.search(query, k=2)
    if not hits:
        return "No relevant passages found."
    return "\n".join(f"[{src}] {txt}" for src, txt, _ in hits)

tools = [{"name": "search_knowledge",
          "description": "Search the travel knowledge base for relevant passages. "
                         "Use before answering questions about destinations.",
          "input_schema": {"type": "object",
                           "properties": {"query": {"type": "string"}},
                           "required": ["query"]}}]

The tool returns each hit prefixed with its [source] tag — so the citation travels with the passage all the way back to Atlas. That tag is what lets the final answer say [kyoto-guide] and mean it. The description tells the model when to reach for the tool: before answering anything about destinations. Atlas can now look things up; next, make it answer only from what it finds.
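
To make the round trip concrete, here is a minimal sketch of the step an agent loop performs when the model asks for this tool: look the function up by name, call it with the model's input, and wrap the output in a tool_result block. The run_tool helper, the stub search_knowledge, and the toolu_demo id are illustrative assumptions; the run_agent loop from earlier in the course may handle this differently.

```python
def search_knowledge(query):
    # Stand-in for the real tool: returns one fixed, source-tagged passage.
    return ("[kyoto-guide] Arashiyama and the bamboo grove are a top autumn "
            "destination, best visited early morning to avoid crowds.")

def run_tool(tool_use, tool_functions):
    """Execute one tool_use request and build the matching tool_result block."""
    fn = tool_functions.get(tool_use["name"])
    if fn is None:
        content, is_error = f"Unknown tool: {tool_use['name']}", True
    else:
        content, is_error = fn(**tool_use["input"]), False
    return {"type": "tool_result", "tool_use_id": tool_use["id"],
            "content": content, "is_error": is_error}

# A tool_use request written as a plain dict for illustration (hypothetical id).
request = {"type": "tool_use", "id": "toolu_demo",
           "name": "search_knowledge",
           "input": {"query": "Arashiyama bamboo grove best time"}}

result = run_tool(request, {"search_knowledge": search_knowledge})
print(result["content"])
```

Because the tool returns a plain string, the [source] tag survives the trip unchanged and is available to the model when it writes the final answer.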


Stage 3: Ground the Answer, Cite the Source, Refuse Honestly

Retrieval without discipline is just a fancier guess. The answer_with_citations function from Lesson 4 adds the grounding gate: it searches the knowledge base, and before it ever calls the model, it checks whether the top hit clears a similarity floor. Real matches score 0.28 and up; keyword noise lands below 0.14. The MIN_SCORE = 0.15 floor sits between them — above it, Atlas answers from the passages and cites them; below it (or no hits at all), Atlas refuses without spending a single model call.

MIN_SCORE = 0.15  # similarity floor: real matches score 0.28+, keyword noise <0.14

def answer_with_citations(client, kb, question, *, system, k=3, model="claude-haiku-4-5"):
    hits = kb.search(question, k=k)
    # Grounding gate: if nothing relevant, refuse instead of hallucinating.
    if not hits or hits[0][2] < MIN_SCORE:
        return {"answer": "I don't have that in my knowledge base.", "sources": [], "grounded": False}

    context = "\n".join(f"[{i}] ({src}) {txt}" for i, (src, txt, _) in enumerate(hits, 1))
    prompt = (f"Answer using ONLY the sources below. Cite them inline like [1]. "
              f"If they don't cover it, say you don't know.\n\nSources:\n{context}\n\n"
              f"Question: {question}")
    r = client.messages.create(model=model, max_tokens=512, system=system,
                               messages=[{"role": "user", "content": prompt}])
    answer = "".join(b.text for b in r.content if b.type == "text")
    return {"answer": answer, "sources": [(src, sc) for src, _, sc in hits], "grounded": True}

Two things make this trustworthy. First, the prompt hands the model the retrieved passages and the rule “answer using ONLY the sources” — so Atlas summarizes fact instead of inventing. Second, the gate runs before the model: a question the knowledge base can’t cover returns the honest refusal with grounded: False and zero tokens spent. The model never gets a chance to hallucinate, because it’s never asked.
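
The gate itself is plain Python and can be exercised without any client. The sketch below pulls the check into a hypothetical is_grounded helper (not part of the lesson's code) and runs it on hand-written hit lists; the scores are invented to sit on either side of the floor.

```python
MIN_SCORE = 0.15  # same floor as in answer_with_citations

def is_grounded(hits, min_score=MIN_SCORE):
    """True when a top hit exists and clears the similarity floor."""
    return bool(hits) and hits[0][2] >= min_score

# Hits are (source, text, score) tuples, sorted best-first like kb.search output.
# The texts and scores below are made up for illustration.
strong = [("kyoto-guide", "passage about Arashiyama", 0.31)]
weak = [("sapporo-guide", "passage about warm layers", 0.09)]

for label, hits in [("strong", strong), ("weak", weak), ("empty", [])]:
    print(label, "->", "answer" if is_grounded(hits) else "refuse")
```

Only the top hit is checked, so lower-ranked passages that fall below the floor still reach the prompt when the best one passes. If that matters for your guides, filtering every hit against the floor is one option.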


Stage 4: Tie In Memory — Two Different Kinds of Knowing

Atlas already has memory from Module 4. So now it has two ways of “knowing” — and keeping them straight is the whole point of this stage:

  • Memory is what Atlas remembers about the traveler. “This traveler is vegetarian.” “They prefer budget hostels.” These are durable facts about a person, written to a VectorMemory and recalled across sessions. The traveler told Atlas; Atlas remembered.
  • Retrieval is facts about the world, from the knowledge base. “Arashiyama is best visited early morning.” “Kyoto foliage peaks in late November.” These are facts about destinations, looked up in the travel guides at question time. Atlas didn’t know them — it retrieved them.

A real planning turn uses both. Atlas recalls the traveler’s preferences from memory and searches the knowledge base for destination facts, then plans from both at once:

# Memory: facts about the traveler (Module 4)
recalled = recall(memory, "food spots near Arashiyama")
# -> "Relevant things you remember about this traveler: The traveler is vegetarian..."

# Retrieval: facts about the destination (this module)
hits = kb.search("Arashiyama bamboo grove best time", k=2)
# -> [("kyoto-guide", "Arashiyama and the bamboo grove ... best visited early morning ...", 0.31)]

The split is clean: memory personalizes the plan (“vegetarian, on a budget”), retrieval grounds it (“go early, foliage peaks in November”). Atlas merges them into one prompt and produces a plan that’s both right for this traveler and right about this place.
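
One way that merge could look, sketched under assumptions: build_planning_prompt is a hypothetical helper, and the recalled string and hits list are typed in by hand to mirror the shapes shown above rather than produced by a real recall or kb.search call.

```python
def build_planning_prompt(recalled, hits, question):
    """Combine traveler memory and retrieved passages into one user prompt."""
    sources = "\n".join(f"[{i}] ({src}) {txt}"
                        for i, (src, txt, _) in enumerate(hits, 1))
    return (f"{recalled}\n\n"
            f"Use ONLY the sources below for destination facts and cite them "
            f"inline like [1].\n\nSources:\n{sources}\n\n"
            f"Question: {question}")

# Hand-written stand-ins for the memory and retrieval results.
recalled = ("Relevant things you remember about this traveler: "
            "The traveler is vegetarian.")
hits = [("kyoto-guide",
         "Arashiyama and the bamboo grove are a top autumn destination, "
         "best visited early morning to avoid crowds.", 0.31)]

prompt = build_planning_prompt(recalled, hits, "Plan my morning in Arashiyama.")
print(prompt)
```

Keeping the two inputs as separate arguments mirrors the two-store split: memory text goes in as context about the person, while only the retrieved passages are numbered and citable.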

Real agents combine memory and retrieval

This pairing is the backbone of almost every production agent. Memory answers “what do I know about this user?” — preferences, history, past decisions, pulled from a store you write to over time. Retrieval answers “what’s true about the world / the docs?” — facts looked up in a knowledge base at question time. A support bot recalls your account tier (memory) and retrieves the current refund policy (retrieval). A coding agent remembers your style (memory) and retrieves the library’s API docs (retrieval). Keep them as two stores with two jobs, and merge their hits into the prompt before the model answers.


Stage 5: Putting It Together — A Full Grounded Planning Turn

Now run the two verified behaviors that prove the whole pipeline works. First, the grounded, cited answer through the agentic loop — Atlas decides to search, reads the retrieved passage, and answers with a citation:

out = run_agent(client, "When should I visit Arashiyama?",
                system="Search the knowledge base before answering; cite the source.",
                tools=tools, tool_functions={"search_knowledge": search_knowledge})
print(out["answer"], "| steps:", out["steps"])

Running it:

Visit Arashiyama and the bamboo grove early morning to avoid crowds [kyoto-guide]. | steps: 2

Trace it: in step 1, Atlas calls search_knowledge and the tool returns the real Kyoto passage ("…best visited early morning to avoid crowds"); in step 2, Atlas reads it and answers, citing [kyoto-guide] because that tag rode along with the passage. Two steps, every claim grounded in a retrieved source. (The wording shown is an example: this run was verified against a mock client because there’s no ANTHROPIC_API_KEY in this environment, and a live model’s exact phrasing will vary.)

Now the behavior that matters just as much — the honest refusal. Ask Atlas something its travel guides never mention:

out = answer_with_citations(client, kb, "What is the capital of Brazil?",
                            system="You are Atlas.")
print(out["answer"], "| grounded:", out["grounded"], "| model calls:", client.calls)

Running it:

I don't have that in my knowledge base. | grounded: False | model calls: 0

The grounding gate caught it: no hit cleared MIN_SCORE, so Atlas refused with zero model calls (client.calls == 0). It didn’t guess “Brasília” from the model’s training. It said it doesn’t know, because the knowledge base has nothing relevant. That’s the discipline that makes Atlas trustworthy: it answers from your guides or not at all.
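
For reference, a mock with a calls counter like the one used above can be very small. This is a sketch of what an SDK-shaped mock might look like, not the one used to verify the lesson: the MockClient class and its canned reply are assumptions, and it only mimics the messages.create call shape and the response.content blocks that answer_with_citations reads.

```python
from types import SimpleNamespace

class MockClient:
    """Mimics client.messages.create(...) and counts how often it is called."""
    def __init__(self, reply="Visit early morning [1]."):
        self.calls = 0
        self.reply = reply  # canned text returned on every call
        # Expose .messages.create so call sites look like the real client.
        self.messages = SimpleNamespace(create=self._create)

    def _create(self, **kwargs):
        self.calls += 1
        block = SimpleNamespace(type="text", text=self.reply)
        return SimpleNamespace(content=[block])

client = MockClient()
r = client.messages.create(model="claude-haiku-4-5", max_tokens=512,
                           system="You are Atlas.",
                           messages=[{"role": "user", "content": "hi"}])
answer = "".join(b.text for b in r.content if b.type == "text")
print(answer, "| model calls:", client.calls)
```

A counter like this is what lets a test assert that the refusal path never reached the model, rather than inferring it from the answer text.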

That’s the full module working at once — a knowledge base Atlas indexes, a tool it retrieves through, grounded answers that cite their source, and an honest refusal when there’s nothing to ground in.


Practice Exercises

Exercise 1: Add a third guide and watch retrieval route to it

Add a third destination, say an osaka-guide about street food and the Dotonbori district. Then ask Atlas “where’s the best street food?” and confirm search returns the Osaka passage, not Kyoto or Sapporo.

Hint

Write an Osaka guide string, call kb.add_document("osaka-guide", OSAKA), and re-check len(kb.passages). Then kb.search("best street food", k=2) — the top hit’s source should be osaka-guide, because its passage shares the most words with the query. This is retrieval routing: with more documents, the query naturally pulls back the one that matches, and the [source] tag tells Atlas which guide to cite.

Exercise 2: Swap in real semantic embeddings

The keyword embed() only matches on shared words — “where can I grab a bite?” wouldn’t surface the “street food” passage because no words overlap. Replace it with sentence-transformers so retrieval fires on meaning.

Hint

Install sentence-transformers, load all-MiniLM-L6-v2, and replace embed(text) with model.encode(text, normalize_embeddings=True), so the vectors stay unit-length and the dot product in search remains a cosine similarity. The KnowledgeBase class doesn’t change: add_document and search still call embed and take a dot product. Now “grab a bite” lands near “street food” in vector space even with no shared words. For a persistent, scalable store you’d move to a vector database like chromadb; the DataTweets Generative AI course walks through both. Re-tune MIN_SCORE afterward, since real embeddings produce a different score distribution.

Exercise 3: Persist the knowledge base to disk

The KnowledgeBase lives in a Python list, so it vanishes when the program exits — and re-indexing every document on each startup is wasteful at scale. Make it survive restarts.

Hint

You only need to persist the (source, text) pairs, because the vectors can be recomputed by re-embedding each passage on load. Add a save(path) that writes the (source, text) pairs as JSON, and a load(path) that reads them back and rebuilds self.passages by re-embedding each text. On startup, load the file if it exists instead of re-add_document-ing every guide. With real semantic embeddings you’d persist the vectors too (or let a vector database like chromadb handle storage), since model.encode is far slower than the keyword version.
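
A minimal sketch of that save/load pair. PersistentKB, add_passage, and the method names are illustrative, not part of the lesson's code, and the embed here is a simplified pure-Python stand-in for the Stage 1 version (no stop words, no normalization, no NumPy) so the block runs on its own.

```python
import hashlib, json, os, re, tempfile

def embed(text, dim=64):
    # Simplified stand-in for the lesson's embed: hashed keyword counts.
    vec = [0.0] * dim
    for tok in re.findall(r"[a-z]+", text.lower()):
        vec[int(hashlib.md5(tok.encode()).hexdigest(), 16) % dim] += 1.0
    return vec

class PersistentKB:
    def __init__(self):
        self.passages = []  # list of (source, text, vec)

    def add_passage(self, source, text):
        self.passages.append((source, text, embed(text)))

    def save(self, path):
        # Persist only (source, text); the vectors are rebuilt on load.
        pairs = [[src, txt] for src, txt, _ in self.passages]
        with open(path, "w", encoding="utf-8") as f:
            json.dump(pairs, f, ensure_ascii=False)

    @classmethod
    def load(cls, path):
        kb = cls()
        with open(path, encoding="utf-8") as f:
            for src, txt in json.load(f):
                kb.add_passage(src, txt)  # re-embed each stored passage
        return kb

kb = PersistentKB()
kb.add_passage("sapporo-guide",
               "Sapporo in winter is known for heavy snowfall and the "
               "February Snow Festival.")
path = os.path.join(tempfile.mkdtemp(), "kb.json")
kb.save(path)
restored = PersistentKB.load(path)
print(len(restored.passages), restored.passages[0][0])  # 1 sapporo-guide
```

Storing already-chunked passages means load does not need to call chunk again, which keeps the restored store identical to the saved one even if the chunking rules change later.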


Summary

You assembled Module 6 into one grounded agent. Atlas builds a knowledge base with the KnowledgeBase from Lesson 2: add_document chunks and embeds your travel guides, indexing the Kyoto and Sapporo documents into 4 passages. It reaches into that store through the search_knowledge tool from Lesson 3, which returns each hit tagged with its source so the citation travels with the passage. It answers with the grounding discipline from Lesson 4: answer_with_citations retrieves, checks the MIN_SCORE = 0.15 floor, and either answers from the passages with inline citations or refuses honestly. The verified behaviors prove it: asking about Arashiyama returned a cited, grounded answer in 2 steps ("…early morning to avoid crowds [kyoto-guide]"), and “What is the capital of Brazil?” returned “I don’t have that in my knowledge base.” with zero model calls. The pure-Python KnowledgeBase ran for real; the Claude-dependent pieces (the grounded answer, the refusal gate, and the agentic retrieval loop) were verified against an SDK-shaped mock client, since this environment has no ANTHROPIC_API_KEY.

Key Concepts

  • Knowledge base from documents: add_document chunks and embeds your guides; the Kyoto and Sapporo docs index into 4 searchable passages.
  • Retrieval as a tool: search_knowledge lets Atlas look up destinations mid-plan, returning passages tagged with their source.
  • Ground, cite, refuse: answer only from retrieved passages, cite the source inline, and gate out anything below MIN_SCORE with zero model calls.
  • Memory vs. retrieval: memory is what Atlas remembers about the traveler; retrieval is facts about the world from the knowledge base; real agents use both.

Why This Matters

This is the shape of almost every production agent: a store you write user facts to, a knowledge base you search for world facts, and the discipline to answer only from what you found. Get it right and Atlas plans trips grounded in your guides, cites where each fact came from, and admits what it doesn’t know — the difference between a demo and something you’d trust with a real traveler. Get it wrong and you ship an agent that confidently invents a refund policy or a foliage date. Atlas now retrieves, grounds, cites, and refuses — the foundation every trustworthy agent stands on.


Next Steps

Continue to Module 7 - Multi-Agent Systems

Specialized agents that collaborate — routing, orchestration, and handoffs.


Continue Building Your Skills

Atlas can now ground its trip plans in a knowledge base it searches and cites, refuse honestly when the guides come up empty, and combine those retrieved facts with what it remembers about the traveler. You’ve built a complete retrieval-augmented agent. The next module changes the shape of the problem entirely: instead of one agent doing everything, you’ll build teams of specialized agents that route work to each other, hand off tasks, and collaborate — turning Atlas from a soloist into an orchestra. On to multi-agent systems.