Lesson 1 - Aggregate Functions: COUNT, SUM, AVG, and More
Welcome to Aggregate Functions
In earlier lessons you learned the basics of querying. You can display columns with the SELECT clause, filter rows with the WHERE clause, and refine values with functions like ROUND. Those tools all work one row at a time. In this lesson you take a different step: instead of looking at individual rows, you summarize a whole column into a single number.
That is what aggregate functions do. They let you answer questions about your entire dataset at once — how many invoices are there?, what did we sell in total?, what is the average order size? By the end of this lesson you will be reaching for them naturally.
By the end of this lesson, you will be able to:
- Explain the difference between scalar and aggregate functions
- Use
SUMandAVGto total and average a numeric column - Use
MINandMAXto find extreme values in numeric and text columns - Use
COUNTto count rows and non-empty values - Rename a result column with the
ASclause
No prior experience with aggregates is needed — only the SELECT, FROM, WHERE, and LIMIT clauses you already know. Let’s begin.
Data for this lesson
Database: chinook.db — a digital music store (SQLite).
Tables used: invoice, track
Open it in any SQLite client — DB Browser for SQLite, the sqlite3 CLI, or Python’s sqlite3 — to run every query yourself.
Scalar vs. Aggregate Functions
You have already met a scalar function: ROUND. A scalar function operates on each row of a column independently and returns one output per row. If a column has 600 rows, ROUND gives you 600 rounded values.
An aggregate function is different. It operates on a whole collection of rows at once and returns a single value. If a column has 600 rows, SUM gives you one total.
| Scalar functions | Aggregate functions | |
|---|---|---|
| Applies to | each row or value of a column | a collection of a column’s rows |
| Output | one output per row | a single output |
Aggregate functions let you summarize a table with statistics like the sum, mean, minimum, and maximum. Those statistics turn a long table into insight.
The dataset for this lesson
Every example here runs against the Chinook database, which models a fictional digital music store — think of an online store like iTunes. One of its core tables is the invoice table, which stores the total purchase amount of each customer’s invoice. Here are the first three rows for a few columns:
| invoice_id | customer_id | invoice_date | billing_country | total |
|---|---|---|---|---|
| 1 | 2 | 2021-01-01 00:00:00 | Germany | 1.98 |
| 2 | 4 | 2021-01-02 00:00:00 | Norway | 3.96 |
| 3 | 8 | 2021-01-03 00:00:00 | Belgium | 5.94 |
As a quick review, here is the query that displays those rows. It uses SELECT, FROM, and LIMIT — nothing new yet:
SELECT invoice_id, customer_id, invoice_date, billing_country, total
FROM invoice
LIMIT 3;Throughout this lesson we will sometimes refer to a tiny three-row illustrative table called example, holding three made-up invoices with the totals 15.84, 9.90, and 1.98 and the billing countries USA, Canada, and France. Using a small table with round numbers makes it easy to check the math by hand before running the same query on the full invoice table.
Summing a Column with SUM
The first aggregate function is SUM. It adds up every value in a numeric column and returns the total as a single number.
A worked example
Using the three-row example table above, you can find the overall purchase amount by summing the total column:
SELECT SUM(total)
FROM example;SQL computes 15.84 + 9.90 + 1.98 and returns one value:
SUM(total)
----------
27.72Notice the column heading is SUM(total) — SQL names the result after the operation you performed. That is rarely what you want in a report. To give the result a clear name, use the AS clause:
SELECT SUM(total) AS overall_sale
FROM example;overall_sale
------------
27.72Summing the real table
The same query works on the full invoice table — you just point it at invoice instead of example:
SELECT SUM(total) AS overall_sale
FROM invoice;overall_sale
------------
2328.6This adds up the total column across all 412 invoices in the store and returns a single grand total of 2328.60. The point to remember: SUM collapses many rows into one number.
Averaging a Column with AVG
The AVG aggregate function computes the average (also called the mean) of a numeric column. It works just like SUM: you can apply it to any numeric column.
Under the hood, an average is the sum of the values divided by how many there are. For the example table, that is (15.84 + 9.90 + 1.98) / 3, which equals 9.24.
To compute the average purchase amount across the whole invoice table:
SELECT AVG(total) AS avg_sale
FROM invoice;avg_sale
--------
5.651...SUM and AVG are close siblings — both reduce a numeric column to one value. The difference is simply whether you want the total or the typical value.
Finding Extremes with MIN and MAX
The MIN and MAX aggregate functions return the smallest and largest values in a column.
Numeric columns
Looking at the three invoices in the example table, the lowest total purchase amount is 1.98. You can find it with MIN:
SELECT MIN(total) AS min_sale
FROM example;min_sale
--------
1.98MAX works the same way and would return the highest value, 15.84. To find the largest invoice total across the entire store:
SELECT MAX(total) AS max_sale
FROM invoice;This returns the single biggest purchase amount in the total column.
Text columns
MIN and MAX are not limited to numbers — they also work on text columns, where “smallest” and “largest” mean alphabetical order.
This is useful when you want the first or last value alphabetically. In the example table, the billing countries are USA, Canada, and France. Sorted alphabetically, the first is Canada — even though it is not the first row in the table. MIN finds it:
SELECT MIN(billing_country) AS first_billing_country
FROM example;first_billing_country
---------------------
CanadaMAX returns the last country alphabetically. Running it on the full invoice table:
SELECT MAX(billing_country) AS last_billing_country
FROM invoice;This returns the billing country that comes last in alphabetical order across every invoice in the store.
MIN and MAX on text
For text, MIN and MAX compare strings alphabetically. The “minimum” is the value closest to the start of the alphabet (A), and the “maximum” is the value closest to the end (Z). This has nothing to do with the order the rows appear in the table.
Counting Rows with COUNT
COUNT is the odd one out. The other four functions operate on a column’s values; COUNT is most often used to count rows.
Counting all rows
To count every row in a table, use COUNT(*). The * is a wildcard meaning “all rows,” so you do not name a column:
SELECT COUNT(*) AS num_row
FROM example;num_row
-------
3Run the same idea on the real invoice table and you learn how many invoices the store has:
SELECT COUNT(*) AS num_rows
FROM invoice;num_rows
--------
412Counting non-empty values
Sometimes a column has missing data. SQL stores a missing value as NULL, which means empty. Consider the track table, which lists every song available in the store:
| track_id | name | album_id | composer | milliseconds | unit_price |
|---|---|---|---|---|---|
| 1 | For Those About To Rock (We Salute You) | 1 | Angus Young, Malcolm Young, Brian Johnson | 343719 | 0.99 |
| 63 | Desafinado | 8 | 185338 | 0.99 | |
| 64 | Garota De Ipanema | 8 | 285048 | 0.99 |
Notice the composer value is empty in the second and third rows — that is a NULL. Counting all rows in this table gives the full track count:
SELECT COUNT(*) AS num_track
FROM track;num_track
---------
3503But what if you want to count only the rows where a particular column actually has a value? Then you put the column name inside COUNT instead of *. Counting the name column returns the same number:
SELECT COUNT(name) AS num_track_name
FROM track;num_track_name
--------------
3503Because name is never empty, the count matches COUNT(*). The composer column is a different story — many tracks have no listed composer:
SELECT COUNT(composer) AS num_composers
FROM track;This returns a number smaller than 3503, because COUNT(composer) skips the rows where composer is NULL.
COUNT(*) vs. COUNT(column)
COUNT(*) counts every row in the table. COUNT(column) counts only the rows where that specific column is not NULL. When the two numbers differ, you have found missing values in that column.
The General Pattern
Every query in this lesson follows the same simple template:
SELECT aggregate_function(column) AS new_name
FROM table;You pick a function (SUM, AVG, MIN, MAX, or COUNT), apply it to a column, and optionally rename the result with AS. That one pattern unlocks a huge range of questions about your data.
Practice Exercises
Try these on the Chinook database before checking the hints. Each uses a single aggregate function.
Exercise 1: Total runtime of every track
The track table has a milliseconds column holding each track’s runtime. Write a query that sums that column and names the result total_runtime.
-- Your code hereHint
Use SUM(milliseconds) and rename it with AS total_runtime. The table is track.
Exercise 2: Cheapest and most expensive track
Using the unit_price column of the track table, write a single query that returns both the minimum price as min_price and the maximum price as max_price.
-- Your code hereHint
You can list two aggregates in one SELECT, separated by a comma: SELECT MIN(unit_price) AS min_price, MAX(unit_price) AS max_price FROM track;
Exercise 3: How many tracks have a composer?
Write a query that counts how many rows in the track table have a non-empty composer value, naming the result num_composers. Then write a second query counting all rows. Are the numbers the same? Why or why not?
-- Your code hereHint
Use COUNT(composer) for non-empty values and COUNT(*) for all rows. The difference is the number of tracks missing a composer.
Summary
In this lesson you learned to summarize an entire column into a single value using SQL’s five core aggregate functions. You saw how they differ from scalar functions, how to rename results with AS, and how COUNT behaves differently from the rest.
Key Concepts
- Aggregate function — a function that operates on a whole collection of rows and returns a single value.
- Scalar function — a function that operates on each row independently and returns one output per row (e.g.,
ROUND). - SUM — adds all values in a numeric column.
- AVG — computes the average (mean) of a numeric column.
- MIN / MAX — return the smallest/largest value; for text columns this means first/last alphabetically.
- COUNT(*) — counts every row in a table.
- COUNT(column) — counts only the rows where that column is not
NULL. - AS — renames a result column so reports are readable.
Why This Matters
Aggregate functions are the bridge from raw data to insight. A table of 600 invoices is just a list until you ask “what did we sell in total?” or “what is the average order?” — and those answers come from a single aggregate query. Almost every dashboard, report, and KPI you will ever build starts with COUNT, SUM, or AVG. Mastering them now sets you up for everything that follows.
Next Steps
You can now compute one summary value at a time. Next you will combine several aggregates in one query and mix them with WHERE, scalar functions, and arithmetic to build richer statistics.
Continue to Lesson 2 - Building Summary Statistics
Combine aggregates with WHERE, scalar functions, and arithmetic to compute richer summary metrics
Back to Module Overview
Return to the Summarizing Data with SQL module overview
Continue Building Your Skills
You have taken your first step from reading rows to reading insight. The five functions you learned here — SUM, AVG, MIN, MAX, and COUNT — are among the most-used tools in all of SQL. Keep practicing them on the Chinook tables, and notice how naturally a business question turns into a one-line query.