Module · 7 lessons

Working with LLMs in Python

Make your first calls to a large language model from Python — the Messages API, system prompts, conversations, tokens, cost, and streaming.

Start module Back to Generative AI & LLM Engineering

At a glance

Level

Intermediate

Lessons

7 lessons

Time to complete

1–2 weeks

Cost

Free forever · no sign-up

Welcome to Working with LLMs in Python, the first module of the Generative AI & LLM Engineering course. Before you can build tools, retrieval systems, or agents, you need to be fluent in the one operation everything else is built on: sending a prompt to a large language model and getting a useful answer back. That is what this module makes second nature.

You will start by understanding what a language model actually does — predicting one token at a time — and what the words tokens, context, and sampling really mean. Then you’ll write real Python: your first call to the Anthropic Messages API, shaping a model’s behavior with a system prompt, holding a multi-turn conversation, and reading the usage numbers that determine what a request costs. You’ll learn to stream responses so they appear word by word, and to control output length, stopping, and model choice.

Every example is real, runnable code against the live API. To keep your costs near zero while you learn, the lessons default to the inexpensive claude-haiku-4-5 model — and you’ll learn exactly when a more capable model is worth the price. You’ll keep your API key in an environment variable (never in your code), the way professionals do.

Start with Lesson 1, where you’ll learn what a large language model is really doing every time it writes a word.

Lessons in this module

1 How Large Language Models Work Understand what a large language model actually does — predicting one token at a time — and what tokens, context, and sampling really mean. 2 Your First Claude Call Install the Anthropic SDK, set your API key safely, create a client, and make your first real call to Claude — then dissect every field of the response. 3 System Prompts and Roles Learn the three roles in a Claude request — system, user, and assistant — and how the system prompt is your main lever for tone, format, and rules. 4 Multi-Turn Conversations Learn why the Claude API is stateless and how to hold a real conversation by resending a growing message history each turn. 5 Tokens, Cost, and Streaming Read token usage from a real response, compute the exact cost of a call, budget tokens before sending, and stream output so it appears as it is generated. 6 Controlling the Output Steer Claude's output with max_tokens, stop_sequences, and temperature, and learn how to pick the right model for the job. 7 Guided Project: A Command-Line Assistant Tie the whole module together by building a complete command-line chat assistant with a persona, memory, streaming, and live cost tracking.

Achievement

Complete all 7 lessons to finish the Working with LLMs in Python module.

Start module

Courses

DATATWEETS

Title here

Working with LLMs in Python

At a glance

Lessons in this module