Lesson 1 - Running SQL Queries with sqlite3

Welcome to Querying SQLite from Python

So far you have written SQL inside a database tool and read the results in a table. In this lesson you bring SQL into Python. You will connect to a real database file, run a query from your own code, and get the results back as Python objects you can loop over, slice, and pass to other functions.

The tool that makes this possible is sqlite3, a library that comes built into Python — there is nothing to install. SQLite is a database that needs no separate server; it stores an entire database in a single file on disk. That makes SQLite ideal for datasets that are too large to comfortably hold in memory but small enough to fit on disk, and it makes databases easy to share: you can hand someone a single .db file.

By the end of this lesson, you will be able to:

  • Connect to a SQLite database from Python with sqlite3.connect()
  • Create a cursor and run SQL queries with execute()
  • Fetch results as a list of tuples and read individual values out of them
  • Retrieve a specific number of rows with fetchone() and fetchmany()
  • Close a database connection cleanly when you are done

A little SQL is all you need to start. Let’s begin.

Data for this lesson

Database: jobs.db — U.S. recent-graduate job statistics (SQLite).

Tables used: recent_grads

Open it in any SQLite client — DB Browser for SQLite, the sqlite3 CLI, or Python’s sqlite3 — to run every query yourself.


The Data and the Workflow

Throughout this lesson you will work with the American Community Survey data on recent college graduates — the same dataset you may have seen in earlier SQL courses. It lives in a table named recent_grads inside a database file called jobs.db. Each row describes one college major and how its graduates fared in the job market.

A small preview of the table looks like this:

RankMajor_codeMajorMajor_categoryTotalSample_sizeMenWomenShareWomenEmployed
12419PETROLEUM ENGINEERINGEngineering23393620572820.1205641976
22416MINING AND MINERAL ENGINEERINGEngineering7567679770.101852640
32415METALLURGICAL ENGINEERINGEngineering85637251310.153037648
42417NAVAL ARCHITECTURE AND MARINE ENGINEERINGEngineering12581611231350.107313758
52405CHEMICAL ENGINEERINGEngineering3226028921239110210.34163125694

A few of the columns you will use:

  • Rank — the major’s rank by median earnings
  • Major — the name of the major
  • Major_category — the broader category the major belongs to
  • Total — the total number of graduates
  • ShareWomen — women as a share of graduates, from 0 to 1
  • Employed — the number of employed graduates

The workflow you will follow is the same one you will use with almost any database from Python:

1. Connect    →  open a connection to the database file
2. Cursor     →  create a cursor to run queries through
3. Execute    →  send a SQL query string to the database
4. Fetch      →  pull the results back into Python
5. Close      →  release the connection

Keep these five steps in mind — every example below is just a variation on them.


Connecting to the Database

Because sqlite3 ships with Python, you start simply by importing it:

import sqlite3

To open a database, you call sqlite3.connect() and pass the name of the database file. Since the database is just a file on disk, the argument is the file’s path as a string:

import sqlite3

conn = sqlite3.connect("jobs.db")
print(conn)

Output:

<sqlite3.Connection object at 0x7f9c1a0b3e40>

The value returned is a Connection object, stored here in conn. The Connection represents your live link to the database; you will use it to create cursors and, at the end, to close the connection.

One connection at a time

When you connect, SQLite locks the database file so that no other process can write to it at the same time. This is a deliberate design choice that keeps SQLite simple and lightweight — it avoids the complexity of coordinating many programs writing to one file. The practical consequence is that you should open a connection, do your work, and close it promptly so other processes (and other team members) can use the database.

Note

If you pass a filename that does not exist, sqlite3.connect() will create a brand-new empty database file with that name rather than raise an error. When you mean to open an existing database, double-check the path is correct.


Cursors and Tuples

Before you can run a query, you need to understand two ideas: the cursor that carries your query to the database, and the tuple that each row comes back as.

What a cursor does

While the Connection represents the database as a whole, a cursor is the object that actually does the querying work. A cursor will:

  • Run a SQL query against the database
  • Parse the results that come back
  • Convert those results into native Python objects
  • Hold onto the results so you can fetch them

You create a cursor by calling the Connection’s cursor() method:

cursor = conn.cursor()

Rows come back as tuples

When a cursor returns rows, it represents each row as a tuple. A tuple is a built-in Python data structure that holds an ordered sequence of values, much like a list. The key difference is that tuples are immutable — once created, you cannot change their contents. SQLite uses tuples for rows because they are slightly faster and lighter than lists, which matters when a result set has many rows.

You create an empty tuple with a pair of parentheses, and you read values out of a tuple with bracket notation, indexing from 0:

# An empty tuple
t = ()

# A tuple with two values
fruit = ("Apple", "Banana")

apple = fruit[0]
banana = fruit[1]

print(apple)
print(banana)

Output:

Apple
Banana

When you query recent_grads, each row will arrive as a tuple like (1, 2419, 'PETROLEUM ENGINEERING', 2339.0, ...), with one value per column in the order the columns appear in the query.


Running a Query

With a cursor in hand, you run a query in two steps. First you write the query as a string, then you pass that string to the cursor’s execute() method:

cursor = conn.cursor()

# Write the SQL query as a string
query = "select * from recent_grads;"

# Send the query to the database
cursor.execute(query)

Calling execute() runs the query but does not, by itself, hand you the rows. To collect every row of the result set as a list of tuples, you call fetchall():

# Fetch the full result set as a list of tuples
results = cursor.fetchall()

# Look at the first three rows
print(results[0:3])

Output:

[(1, 2419, 'PETROLEUM ENGINEERING', 2339.0, 2057.0, 282.0, 'Engineering', 0.120564344, 36, 1976, 1849, 270, 1207, 37, 0.018380527, 110000, 95000, 125000, 1534, 364, 193),
 (2, 2416, 'MINING AND MINERAL ENGINEERING', 756.0, 679.0, 77.0, 'Engineering', 0.101851852, 7, 640, 556, 170, 388, 85, 0.117241379, 75000, 55000, 90000, 350, 257, 50),
 (3, 2415, 'METALLURGICAL ENGINEERING', 856.0, 725.0, 131.0, 'Engineering', 0.153037383, 3, 648, 558, 133, 340, 16, 0.024096386, 73000, 50000, 105000, 456, 176, 0)]

Each tuple holds all 21 columns of recent_grads, in the order they appear in the table. Because results is an ordinary Python list, you can slice it (results[0:3]), loop over it, and index into each tuple inside it.

Querying a single column

You do not have to select everything. To pull back just the major names, narrow the select statement:

query = "select Major from recent_grads;"
cursor.execute(query)
majors = cursor.fetchall()

print(majors[0:3])

Output:

[('PETROLEUM ENGINEERING',), ('MINING AND MINERAL ENGINEERING',), ('METALLURGICAL ENGINEERING',)]

Notice that each row is still a tuple, even though it holds only one value. That is why you see the trailing comma in ('PETROLEUM ENGINEERING',) — it is Python’s way of writing a one-element tuple. To get the plain string out, index into the tuple:

first_major = majors[0][0]
print(first_major)

Output:

PETROLEUM ENGINEERING

Here majors[0] is the first row (a tuple) and majors[0][0] is the first value inside it.

Chaining execute and fetch

execute() returns the cursor itself, so you can chain the fetch call directly onto it. This one-liner does the same work as the two-step version above:

majors = cursor.execute("select Major from recent_grads;").fetchall()
print(majors[0:3])

This is a common, compact style you will see often.


Two Shortcuts: Skipping the Cursor and Limiting Results

Running a query straight off the connection

Creating a cursor explicitly is the clearest way to see what is happening, but sqlite3 lets you skip that step. The Connection object has its own execute() method that creates a cursor for you behind the scenes:

conn = sqlite3.connect("jobs.db")
query = "select * from recent_grads;"
results = conn.execute(query).fetchall()

print(results[0:2])

Output:

[(1, 2419, 'PETROLEUM ENGINEERING', 2339.0, 2057.0, 282.0, 'Engineering', 0.120564344, 36, 1976, 1849, 270, 1207, 37, 0.018380527, 110000, 95000, 125000, 1534, 364, 193),
 (2, 2416, 'MINING AND MINERAL ENGINEERING', 756.0, 679.0, 77.0, 'Engineering', 0.101851852, 7, 640, 556, 170, 388, 85, 0.117241379, 75000, 55000, 90000, 350, 257, 50)]

There is no visible cursor here — sqlite3 made one for you. Both styles are valid; use whichever reads more clearly in your code.

Fetching a specific number of rows

Real result sets can be huge, and you often do not want every row. The cursor gives you two more ways to fetch:

  • fetchone() returns the next single row as a tuple.
  • fetchmany(n) returns the next n rows as a list of tuples.

Each cursor keeps an internal counter that remembers where you are in the result set. Every fetch advances that counter, so successive calls pick up where the last one left off:

cursor = conn.cursor()
cursor.execute("select Major, Major_category from recent_grads;")

first_result = cursor.fetchone()
second_result = cursor.fetchone()
next_five_results = cursor.fetchmany(5)

print("First:", first_result)
print("Second:", second_result)
print("Next five count:", len(next_five_results))

Output:

First: ('PETROLEUM ENGINEERING', 'Engineering')
Second: ('MINING AND MINERAL ENGINEERING', 'Engineering')
Next five count: 5

The first fetchone() returns row 1 and advances the counter; the second fetchone() returns row 2; then fetchmany(5) returns rows 3 through 7. When there are no rows left, fetchone() returns None and fetchmany() returns an empty list.

Tip

Use fetchmany() when a table is large enough that loading every row at once would waste memory. You can fetch a batch, process it, and fetch the next batch in a loop.


Closing the Connection

Because SQLite locks the file while you are connected, you should always close the connection when you finish. Closing frees the database for other processes and prevents file-locking problems — something that matters a lot when you share a database with teammates or run code in production.

You close a connection by calling the Connection’s close() method:

conn = sqlite3.connect("jobs.db")

# ... run your queries ...

conn.close()

After you close the connection, any attempt to query through a cursor tied to it raises an error:

conn.close()
cursor.execute("select * from recent_grads;")

Output:

ProgrammingError: Cannot operate on a closed database.

That error is a reminder, not a disaster — it simply means the connection is no longer open.

The full workflow in one place

Putting all five steps together, a complete script looks like this:

import sqlite3

# 1. Connect
conn = sqlite3.connect("jobs.db")

# 2. Cursor
cursor = conn.cursor()

# 3. Execute
query = "select Major, Major_category from recent_grads order by Rank limit 5;"
cursor.execute(query)

# 4. Fetch
top_five = cursor.fetchall()
for row in top_five:
    print(row[0], "—", row[1])

# 5. Close
conn.close()

Output:

PETROLEUM ENGINEERING — Engineering
MINING AND MINERAL ENGINEERING — Engineering
METALLURGICAL ENGINEERING — Engineering
NAVAL ARCHITECTURE AND MARINE ENGINEERING — Engineering
CHEMICAL ENGINEERING — Engineering

This pattern — connect, cursor, execute, fetch, close — is the backbone of querying any database from Python.


Practice Exercises

Try these on your own before checking the hints. Assume jobs.db contains the recent_grads table described above.

Exercise 1: Connect and Fetch All Majors

Connect to jobs.db, write a query that returns every value in the Major column, store the full result set in a variable named majors, and print the first three rows.

# Your code here

Hint

After import sqlite3 and conn = sqlite3.connect("jobs.db"), create a cursor, run select Major from recent_grads;, and call fetchall(). Slice with majors[0:3] to see the first three.

Exercise 2: Read Values Out of Tuples

Using the majors result set from Exercise 1, print the name of the very first major as a plain string (not as a one-element tuple).

# Your code here

Hint

majors[0] is the first row, which is a tuple. The first value inside it is majors[0][0].

Exercise 3: Fetch in Batches

Run a query that returns the Major and Major_category columns. Fetch the first row with fetchone(), then fetch the next five rows with fetchmany(5), and print how many rows the second call returned.

# Your code here

Hint

The cursor’s internal counter advances with each fetch. After one fetchone(), a fetchmany(5) returns rows 2 through 6. Use len() on the result to count them.

Exercise 4: The Whole Workflow

Connect to jobs.db, write and run a query that returns every Major in reverse alphabetical order (Z to A), store the full result set in reverse_alphabetical, and close the connection.

# Your code here

Hint

Add order by Major desc to your select to sort Z to A. You can do it in one line with conn.cursor().execute(query).fetchall(), then remember to call conn.close() at the end.


Summary

You have learned the complete cycle for querying a SQLite database from Python. You connect with sqlite3.connect(), create a cursor, run a SQL query string with execute(), pull rows back with fetchall(), fetchone(), or fetchmany(), and close the connection with close(). Along the way you saw that SQLite returns each row as an immutable tuple, and that you read individual values out of those tuples with bracket indexing.

Key Concepts

  • sqlite3 — Python’s built-in library for working with SQLite databases; no installation needed.
  • SQLite — a serverless database that stores an entire database in a single file on disk, which it locks while connected.
  • Connection — the object returned by sqlite3.connect() that represents your link to the database.
  • Cursor — the object that runs queries, parses results, and holds them; created with conn.cursor().
  • Tuple — an immutable, ordered sequence of values; SQLite represents each result row as one.
  • execute() — runs a SQL query string against the database.
  • fetchall() / fetchone() / fetchmany(n) — retrieve all rows, the next row, or the next n rows.
  • close() — releases the connection so other processes can use the database.

Why This Matters

Most real data does not live in tidy CSV files — it lives in databases. Knowing how to reach into one from Python means you can pull exactly the rows you need into a script, then clean, analyze, and visualize them with the rest of the Python ecosystem. The connect-cursor-execute-fetch-close pattern you learned here is nearly identical across other database libraries too, so the habits you are building transfer directly to PostgreSQL, MySQL, and beyond.


Next Steps

Right now your results come back as lists of tuples, which are fine for small jobs but awkward for serious analysis. In the next lesson you will load query results directly into a pandas DataFrame, where filtering, grouping, and plotting all become easy.

Continue to Lesson 2 - From SQL Results to pandas DataFrames

Load query results straight into a pandas DataFrame for easy analysis and plotting

Back to Module Overview

Return to the Querying Databases from Python module overview


Continue Building Your Skills

You just connected two of the most important tools in data work: SQL for asking questions and Python for everything that comes after. Run these examples against your own .db files, experiment with different queries, and get comfortable with the five-step workflow. The more natural it feels, the more powerful your data analysis becomes.