Lesson 4 - Your First Claude Call

Welcome to Your First Claude Call

You’ve spent the last three lessons building the mental model of an agent: a language model in a decide-act-observe loop, choosing its own actions with tools. Now it’s time to make the model actually respond. Before you can wrap Claude in a loop, you need to be able to call it once — send it a message and get text back. That single call is the atom the entire agent is built from. Master it here, and the loop in the next module is just this call, repeated, with the model’s previous answers fed back in. This lesson takes you from an empty terminal to a working response from Claude, using the official Anthropic Python SDK.

By the end of this lesson, you will be able to:

  • Install the anthropic SDK and store your API key as an environment variable
  • Create a client and send a single message to Claude with messages.create
  • Explain what model, max_tokens, system, and messages each control
  • Read the reply text, the stop reason, and the token usage from the response

Let’s get the SDK installed and your key in place — safely.


Install the SDK and Set Your API Key

Two things stand between you and your first call: the SDK, and a key.

The SDK is one pip install away:

pip install anthropic

You get the key from the Anthropic Console. Create an account, open the API keys section, and generate a new key. The key is a secret: anyone who has it can spend money against your account. Treat it like a password, which means it must never appear in your source code.

The standard, safe way to give your program a secret is an environment variable: a value that lives in your shell’s environment, outside your code, that your program reads at runtime. The Anthropic SDK looks for one specific variable, ANTHROPIC_API_KEY, and uses it automatically. Set it in your terminal before running your script; the setting lasts only for that terminal session. In Windows PowerShell, use $env:ANTHROPIC_API_KEY = "your-key-goes-here"; in a macOS or Linux shell, run:

export ANTHROPIC_API_KEY="your-key-goes-here"

Why an environment variable instead of just pasting the key into a string? Three reasons, and they all matter: (1) your code can be shared, committed to Git, or screenshared without leaking the secret; (2) different machines (your laptop, a teammate’s, a server) can each supply their own key without touching the code; (3) if a key is ever exposed, you rotate the environment variable rather than hunting through source files. The key lives in the environment; the code only knows the name of the variable.

Because a missing key is the most common reason a first call fails with a confusing error, it’s worth a one-line check at the top of your script that fails loudly and clearly:

import os

assert os.environ.get("ANTHROPIC_API_KEY"), "Set ANTHROPIC_API_KEY in your environment first"

If that line passes, the SDK will find your key on its own — you never have to pass it in by hand.
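One caveat with assert: Python strips assert statements when it is started with the -O flag, so the check silently disappears. If that matters to you, the same idea can be written as a small helper that raises an exception instead. This is a minimal sketch; the name require_env and its env parameter are illustrative and not part of the SDK.

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable, or fail with a clear message."""
    # env is injectable so the check can be exercised without touching os.environ.
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Set {name} in your environment first")
    return value


# Usage, at the top of your script:
# require_env("ANTHROPIC_API_KEY")
```

The helper only confirms that the variable is set and non-empty; it cannot tell whether the key itself is valid.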


The Client and a Single Message

With the key in the environment, the SDK gives you a client, the object you make all calls through. Creating it requires no arguments: it reads ANTHROPIC_API_KEY from the environment for you.

Here is the complete first call. We’ll frame it around Atlas, the trip-planning assistant you’ll build across this course: the system prompt is what makes a general-purpose model behave like Atlas.

from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=300,
    system="You are Atlas, a concise trip-planning assistant.",
    messages=[
        {"role": "user", "content": "Name one Japanese city for a 3-day autumn food trip, in one sentence."},
    ],
)

answer = response.content[0].text  # text of the first content block
print(answer)
print(response.stop_reason)  # e.g. "end_turn"
print(response.usage.input_tokens, response.usage.output_tokens)  # tokens sent, tokens generated

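This call to client.messages.create(...) is configured by four pieces. The API requires model, max_tokens, and messages; system is optional, but you’ll set it on every call in this course because it is what makes the model behave like Atlas. Learn what each one does: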

  • model — which Claude model answers. We use claude-haiku-4-5, the fast, low-cost model, as the default throughout this course. It’s more than capable for learning to build agents, and it keeps your bill small while you experiment.
  • max_tokens — a hard ceiling on the length of the reply. The model will never produce more than this many tokens in its answer. It’s a safety cap, not a target: a short reply stays short. We set 300 here because we asked for a single sentence.
  • system — the instructions that define how the model should behave. This is the system prompt, and it’s where Atlas’s personality lives: “You are Atlas, a concise trip-planning assistant.” Change this string and you change what kind of assistant you’re talking to.
  • messages — the conversation so far, as a list. Each entry is a dict with a role ("user" for things you say, "assistant" for things Claude has said) and content (the text). Right now there’s one user turn; in the next module, you’ll append Claude’s replies back into this list to build a real conversation — that’s the loop.
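To make the shape of messages concrete, here is a sketch of how the list grows over a short conversation. No API call is made: the assistant text is a hard-coded placeholder standing in for a real reply, and the follow-up question is invented for illustration.

```python
# One user turn: exactly what the first call sends.
messages = [
    {"role": "user", "content": "Name one Japanese city for a 3-day autumn food trip, in one sentence."},
]

# Placeholder for Claude's reply text (a real run would use response.content[0].text).
reply_text = "Osaka is a great pick for an autumn food trip."

# Append the reply as an assistant turn, then add the next user turn.
messages.append({"role": "assistant", "content": reply_text})
messages.append({"role": "user", "content": "Suggest one dish to try there."})

for turn in messages:
    print(turn["role"], "->", turn["content"])
```

The roles alternate user, assistant, user. Sending this longer list on the next call is how the model gets to see what it said before.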

Run it, and Claude responds. Next, let’s read what comes back.


Reading the Response

The object returned by messages.create carries more than just the answer text. The three fields you’ll reach for constantly are content, stop_reason, and usage.

response.content is a list of content blocks, not a plain string. For a simple text reply there’s one block, and its text is at response.content[0].text. (It’s a list because Claude can return richer responses — multiple blocks, or tool-use requests — which is exactly what makes the agent loop possible later. For now, one text block is all you need.)
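Because content is a list, indexing [0] assumes the first block is a text block. A slightly more defensive pattern is to collect the text from every block whose type is "text". The sketch below uses stand-in objects so it runs without an API call; collect_text is a hypothetical helper and search_flights is a made-up tool name.

```python
from types import SimpleNamespace


def collect_text(content_blocks):
    """Join the text of every block whose type is "text", skipping other block types."""
    return "".join(block.text for block in content_blocks if block.type == "text")


# Stand-in blocks that mimic only the attributes used above.
# With a real response you would call collect_text(response.content).
fake_content = [
    SimpleNamespace(type="text", text="Head to Osaka."),
    SimpleNamespace(type="tool_use", name="search_flights", input={}),
]
print(collect_text(fake_content))
```

For the single-sentence reply in this lesson, this returns the same string as response.content[0].text.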

response.stop_reason tells you why the model stopped generating. On a normal, complete answer this is "end_turn" — the model finished what it wanted to say. The other one to watch for is "max_tokens", which means the reply hit your length cap and was cut off mid-thought; if you see that, raise max_tokens.
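In code, that check is a simple comparison on the string. Here is a minimal sketch covering the two stop reasons discussed in this lesson; describe_stop and is_truncated are hypothetical helper names, and the stand-in objects carry only the one attribute being read.

```python
from types import SimpleNamespace


def is_truncated(response):
    """True when generation stopped because it hit the max_tokens cap."""
    return response.stop_reason == "max_tokens"


def describe_stop(response):
    """Turn the stop reason into a short status message."""
    if response.stop_reason == "end_turn":
        return "complete"
    if is_truncated(response):
        return "truncated: raise max_tokens and try again"
    return f"other stop reason: {response.stop_reason}"


# Stand-ins for real responses; a real call returns an object with the same attribute.
print(describe_stop(SimpleNamespace(stop_reason="end_turn")))
print(describe_stop(SimpleNamespace(stop_reason="max_tokens")))
```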

response.usage reports the token counts for the call: input_tokens (what you sent — system prompt plus messages) and output_tokens (what Claude generated). Those two numbers are what you’re billed on, so usage is how you track cost.
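To turn those two counts into an estimated cost, multiply each by its price and add the results. The sketch below assumes prices are quoted per million tokens; the two price constants are placeholders chosen for easy arithmetic, not real rates, so look up the current prices for the model you use.

```python
def estimate_cost(input_tokens, output_tokens, input_price_per_mtok, output_price_per_mtok):
    """Estimate the cost of one call, in whatever currency the prices are quoted in.

    Both prices are per million tokens.
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_mtok
    output_cost = output_tokens / 1_000_000 * output_price_per_mtok
    return input_cost + output_cost


# PLACEHOLDER prices for illustration only.
EXAMPLE_INPUT_PRICE = 2.00
EXAMPLE_OUTPUT_PRICE = 10.00

# With a real response: estimate_cost(response.usage.input_tokens, response.usage.output_tokens, ...)
cost = estimate_cost(40, 25, EXAMPLE_INPUT_PRICE, EXAMPLE_OUTPUT_PRICE)
print(f"{cost:.6f}")
```

Input and output tokens are priced separately, which is why the function takes two prices rather than one.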

Now, about the output. Claude’s text is non-deterministic — it depends on the model, your exact prompt, and a degree of run-to-run variation — so we can’t print one fixed answer and call it “the output.” What we can tell you is the shape of what you’ll see:

  • answer will be a one-sentence city suggestion. As one example of the kind of answer you’ll see (wording will vary every run): “For a 3-day autumn food trip, head to Osaka — Japan’s street-food capital.” Don’t expect that exact sentence; expect a single concise sentence naming a city.
  • response.stop_reason will be "end_turn" on a normal finish.
  • response.usage will report small token counts — for a prompt and reply this short, expect roughly two-digit numbers for each of input_tokens and output_tokens. The exact values vary per run.

That’s the whole transaction: configure four parameters, send, and read three fields back. Wrap this call in a loop that feeds the reply back into messages, and you have an agent.

Never hardcode or commit your API key

This is one of the most common, and most expensive, beginner mistakes. Do not paste your key into your code as Anthropic(api_key="sk-..."), and do not commit it to version control. A key pushed to a public repository can be found by automated scrapers within minutes and used to run up charges on your account. Always load it from the environment variable ANTHROPIC_API_KEY, which the SDK reads automatically, or from a .env file that is listed in your .gitignore so it never gets committed. The SDK does not read .env files itself, so your script has to load the file into the environment first, for example with the python-dotenv package. If a key ever leaks, revoke it in the Anthropic Console immediately and generate a new one.
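For a .env file to help, something has to copy its entries into the environment before Anthropic() is created. Here is a minimal sketch of that step using only the standard library. load_env_file is a hypothetical helper written for this lesson, and it is deliberately simplified; a maintained package such as python-dotenv handles quoting and edge cases more thoroughly.

```python
import os


def load_env_file(path, env=None):
    """Read KEY=VALUE lines from a file into env without overwriting existing values.

    Simplified on purpose: no multi-line values, no variable expansion, no "export" prefix.
    """
    env = os.environ if env is None else env
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blank lines, comments, and malformed lines
            key, _, value = line.partition("=")
            value = value.strip().strip('"').strip("'")
            env.setdefault(key.strip(), value)  # a value already set in the shell wins
    return env


# Usage, before creating the client:
# load_env_file(".env")
```

Using setdefault means a key already exported in your shell takes priority over the file, so the .env file acts as a fallback rather than an override.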


Practice Exercises

Exercise 1: Change the system prompt

Take the first-call example from “The Client and a Single Message” and change the system prompt from “You are Atlas, a concise trip-planning assistant.” to “You are Atlas, an enthusiastic trip-planning assistant who loves giving detailed reasons.” Before running it, predict how the reply will change. Then run it and check.

Hint

The system prompt steers behavior, so the reply should get longer, warmer, and more reason-heavy — even though the user message is identical. You may also bump into your max_tokens=300 cap if the model gets too detailed; if stop_reason comes back as "max_tokens", the reply was cut off and you’d raise the cap. This is the core lesson: the system prompt, not the user’s words, defines who the assistant is.
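One way to keep the comparison honest is to build both requests from the same function, so the system prompt is the only thing that changes. The sketch below assembles the keyword arguments without sending anything; build_request is a hypothetical helper, not an SDK function.

```python
def build_request(system_prompt, question, max_tokens=300):
    """Assemble the keyword arguments for client.messages.create."""
    return {
        "model": "claude-haiku-4-5",
        "max_tokens": max_tokens,
        "system": system_prompt,
        "messages": [{"role": "user", "content": question}],
    }


question = "Name one Japanese city for a 3-day autumn food trip, in one sentence."
concise = build_request("You are Atlas, a concise trip-planning assistant.", question)
enthusiastic = build_request(
    "You are Atlas, an enthusiastic trip-planning assistant who loves giving detailed reasons.",
    question,
)

# List the keys whose values differ between the two requests.
changed = [key for key in concise if concise[key] != enthusiastic[key]]
print(changed)

# With a client in hand, either request is sent as: client.messages.create(**concise)
```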

Exercise 2: What does max_tokens limit?

Set max_tokens=10 and run the call again with the original system prompt. Does the model still try to name a city in a full sentence? What does response.stop_reason say now?

Hint

max_tokens caps the output length only — it does not tell the model “be brief,” it just stops generation once the limit is reached. With max_tokens=10 the reply will almost certainly be truncated mid-sentence, and stop_reason will be "max_tokens" instead of "end_turn". That’s your signal that the answer is incomplete and the cap is too low for the task.

Exercise 3: Find the cost of a call

After a call returns, you want to know how many tokens it used so you can estimate cost. Which field holds that, and which two numbers make it up?

Hint

response.usage holds it. response.usage.input_tokens is what you sent (system prompt + messages) and response.usage.output_tokens is what Claude generated. Multiply each by the model’s per-token input and output price and add them together to get the cost of that one call. Tracking usage across many calls is exactly how you’ll keep an agent’s running cost in check later.
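Tracking usage across many calls can be as simple as keeping running totals. Here is a minimal sketch; UsageTracker is a hypothetical class, and the token numbers are made up to stand in for response.usage from two real calls.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class UsageTracker:
    """Running totals of tokens across several calls."""

    input_tokens: int = 0
    output_tokens: int = 0
    calls: int = 0

    def record(self, usage):
        """Add one response's usage (any object with input_tokens and output_tokens)."""
        self.input_tokens += usage.input_tokens
        self.output_tokens += usage.output_tokens
        self.calls += 1


tracker = UsageTracker()
# With real responses you would call tracker.record(response.usage) after each call.
tracker.record(SimpleNamespace(input_tokens=40, output_tokens=25))
tracker.record(SimpleNamespace(input_tokens=85, output_tokens=30))
print(tracker.calls, tracker.input_tokens, tracker.output_tokens)
```

Input and output totals are kept separate because they are priced separately.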


Summary

You made your first call to Claude. You installed the anthropic SDK with pip install anthropic, got an API key from the Anthropic Console, and stored it as the ANTHROPIC_API_KEY environment variable — never in your code — so the SDK could read it automatically. You created a client with Anthropic() and sent one message via client.messages.create(...), setting four parameters: model (claude-haiku-4-5, the low-cost default for this course), max_tokens (a hard cap on the reply length), system (the instructions that turn the model into Atlas the trip planner), and messages (the conversation as a list of {"role", "content"} dicts). You read three fields off the response: the reply text at response.content[0].text, response.stop_reason ("end_turn" on a normal finish), and response.usage (input_tokens and output_tokens for cost). And you learned that exact text and token counts vary per run, because the model’s output is non-deterministic.

Key Concepts

  • The client: Anthropic() reads ANTHROPIC_API_KEY from the environment automatically; never hardcode the key.
  • messages.create: one call configured by model, max_tokens, system, and messages.
  • messages: a list of {"role": "user"|"assistant", "content": ...} turns; the system prompt is passed separately through system and defines behavior.
  • Reading the response: text at response.content[0].text; response.stop_reason ("end_turn"); response.usage for token counts.

Why This Matters

This single call is the building block the entire agent loop wraps. In the next module you’ll take exactly this messages.create call, put it inside a loop, and append each of Claude’s replies back into the messages list so the model can build on what it said before — that’s the difference between a one-shot answer and a multi-step agent. Everything else (tools, memory, planning) is layered on top of this one operation. Get comfortable here — sending a message, reading content, checking stop_reason, watching usage — and the loop will feel like a natural extension rather than a leap.


Next Steps

Continue to Lesson 5 - Guided Project: Design the Atlas Agent

Put the pieces together — sketch the design of the Atlas trip-planning agent you'll build.


Continue Building Your Skills

You can now make Claude respond — install, key, client, message, response. That’s the atom of every agent you’ll ever build. Next, you’ll bring together everything from this module and design the Atlas agent itself: deciding its goal, its tools, and how this single call becomes a loop that plans a trip.