Make your first calls to a large language model from Python — the Messages API, system prompts, conversations, tokens, cost, and streaming.
Welcome to Working with LLMs in Python, the first module of the Generative AI & LLM Engineering course. Before you can build tools, retrieval systems, or agents, you need to be fluent in the one operation everything else is built on: sending a prompt to a large language model and getting a useful answer back. That is what this module makes second nature.
You will start by understanding what a language model actually does — predicting one token at a time — and what the words tokens, context, and sampling really mean. Then you’ll write real Python: your first call to the Anthropic Messages API, shaping a model’s behavior with a system prompt, holding a multi-turn conversation, and reading the usage numbers that determine what a request costs. You’ll learn to stream responses so they appear word by word, and to control output length, stopping, and model choice.
Every example is real, runnable code against the live API. To keep your costs near zero while you learn, the lessons default to the inexpensive claude-haiku-4-5 model — and you’ll learn exactly when a more capable model is worth the price. You’ll keep your API key in an environment variable (never in your code), the way professionals do.
Start with Lesson 1, where you’ll learn what a large language model is really doing every time it writes a word.
Complete all 7 lessons to finish the Working with LLMs in Python module.