Module · 4 lessons

Embeddings & Semantic Search

Turn text into vectors that capture meaning — generate embeddings, measure similarity, and build a search engine that finds answers by meaning, not keywords.

At a glance

Level: Intermediate

Lessons: 4

Time to complete: 1 week

Cost: Free forever · no sign-up

Welcome to Embeddings & Semantic Search, the fifth module of the Generative AI & LLM Engineering course. So far you’ve worked with models that read and write text. This module is about a different superpower: turning text into numbers that capture meaning. An embedding is a list of numbers — a vector — that represents a piece of text so that similar meanings end up close together in space. That single idea powers search, recommendations, clustering, and the retrieval systems you’ll build in the next two modules.

You’ll start with the intuition: why “How do I reset my password?” and “I forgot my login” should land near each other even though they share almost no words. Then you’ll generate real embeddings in Python — locally and for free — with sentence-transformers, and see the actual 384-dimensional vectors. You’ll learn to measure similarity with cosine similarity and distance, and finally build a semantic search engine that answers questions over a real FAQ corpus by meaning, not keyword matching.
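To make the idea of cosine similarity concrete before the lessons begin, here is a minimal sketch in plain Python. The three-dimensional vectors are made up for illustration (real embeddings from the module have 384 dimensions and come from a model), and the variable names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-written toy vectors, NOT real model output (assumption for illustration).
reset_password = [0.9, 0.1, 0.2]   # stands in for "How do I reset my password?"
forgot_login = [0.8, 0.2, 0.3]     # stands in for "I forgot my login"
pizza_recipe = [0.1, 0.9, 0.1]     # stands in for an unrelated sentence

sim_close = cosine_similarity(reset_password, forgot_login)
sim_far = cosine_similarity(reset_password, pizza_recipe)

# Cosine distance is commonly defined as 1 minus the similarity.
dist_close = 1 - sim_close
dist_far = 1 - sim_far

print(f"related:   similarity={sim_close:.3f}, distance={dist_close:.3f}")
print(f"unrelated: similarity={sim_far:.3f}, distance={dist_far:.3f}")
```

With these toy numbers the two account-related vectors score much higher than the unrelated one, which is the behaviour the module relies on when real embeddings replace the hand-written lists.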

Every example runs for real on your own machine — no API key and no cost, because the embedding model runs locally. (We’ll also point out hosted options like Voyage AI for when you want managed, larger models.) By the end you’ll have a working search engine and the foundation for vector databases and retrieval-augmented generation, the two modules that come next.

Start with Lesson 1, where you’ll build the intuition for what an embedding actually is.

Lessons in this module

1. What Embeddings Are: Understand embeddings, vectors that capture the meaning of text so that similar meanings sit close together, and why they power search far better than keywords.
2. Generating Embeddings: Install sentence-transformers and generate real embeddings in Python: load a model, encode text one item or a whole batch at a time, and inspect the vectors.
3. Measuring Similarity and Distance: Compare embeddings with cosine similarity (the cosine of the angle between vectors), compute it by hand with numpy and with sentence-transformers, and relate it to distance.
4. Guided Project: Semantic Search: Build a working semantic search engine: embed a real FAQ corpus, rank answers by cosine similarity to a user's question, and return the best matches by meaning.
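As a rough preview of the guided project, the ranking step can be sketched in a few lines of plain Python. This is an illustrative outline only: the FAQ entries, their vectors, the query vector, and the `search` helper are all hypothetical stand-ins, and in the lessons the vectors would come from an embedding model rather than being typed by hand.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical FAQ corpus with made-up 3-dimensional vectors (assumption).
faq = [
    ("How do I reset my password?", [0.9, 0.1, 0.2]),
    ("What are your opening hours?", [0.1, 0.9, 0.1]),
    ("How do I update my billing address?", [0.2, 0.3, 0.9]),
]

def search(query_vector, corpus, top_k=2):
    """Return the top_k (score, text) pairs, highest similarity first."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Made-up vector standing in for the question "I forgot my login".
query_vector = [0.8, 0.2, 0.3]
results = search(query_vector, faq)

for score, text in results:
    print(f"{score:.3f}  {text}")
```

Scoring every entry and sorting is fine for a small FAQ; larger corpora are where the vector databases covered in the next module come in.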

Achievement

Complete all 4 lessons to finish the Embeddings & Semantic Search module.
