Lesson 7 - Guided Project: Sales Analytics for Northwind Traders
On this page
- Welcome to the Northwind Guided Project
- Setting Up and Exploring the Data
- Task 1: Rank Employee Sales Performance
- Task 2: Running Total of Monthly Sales
- Task 3: Month-Over-Month Sales Growth
- Task 4: Identify High-Value Customers
- Task 5: Percentage of Sales for Each Category
- Task 6: Top Products per Category
- Summary
- Next Steps
- Continue Building Your Skills
Welcome to the Northwind Guided Project
You have learned the four families of window functions and the CTE patterns that make them practical. Now you will put it all together. In this guided project you step into the role of a Data Analyst at Northwind Traders, an international gourmet food distributor, and build a sales analytics report that management can act on.
The Northwind database is a classic, realistic sample database — it models customers, orders, products, categories, and employees, just like a real distributor’s system. Working with it gives you the closest thing to on-the-job experience: real joins, real aggregations, and real business questions.
By the end of this project, you will be able to:
- Combine window functions with CTEs and joins to answer layered business questions
- Build employee rankings, running totals, and month-over-month growth rates
- Identify high-value customers and top-performing products
- Interpret query results to support data-driven decisions
Unlike earlier lessons, this is a project: each section gives you a task, the approach, and hints, and you write the query. Treat it as a portfolio piece. Let’s begin.
Data for this lesson
Database: the classic Northwind trading company, loaded into your local PostgreSQL (set up in Lesson 6 of PostgreSQL for Data Engineers).
Tables used: orders, order_details, products, categories, employees, customers
These queries use PostgreSQL syntax — run them in psql or any Postgres client.
Setting Up and Exploring the Data
You can work two ways. The recommended route is a local PostgreSQL install loaded with the Northwind database, queried from a Jupyter notebook so your work is shareable. Alternatively, any PostgreSQL environment with Northwind loaded will do. To run SQL in a Jupyter notebook, prefix each query cell with %%sql on its own line.
Start by listing the tables so you know what you are working with:
SELECT table_name AS name,
table_type AS type
FROM information_schema.tables
WHERE table_schema = 'public'
AND table_type IN ('BASE TABLE', 'VIEW');A note for Jupyter users
The employees table has a photo column whose data type cannot render with %%sql. You do not need it for this project, so drop it near the top of your notebook: ALTER TABLE employees DROP COLUMN photo;
The tables you will use most are orders, order_details, products, categories, customers, and employees. Spend a few minutes exploring them. Joining order_details to orders and products gives you line-item revenue (unit_price * quantity * (1 - discount)), which is the basis for nearly every metric below. Consider saving useful joins as views with CREATE VIEW.
Hint
Explore tables with SELECT * FROM table_name LIMIT 10;. To get full order detail, join orders, order_details, and products. To see who handled each order, join employees and orders on the employee id.
Task 1: Rank Employee Sales Performance
Management wants to recognize top performers and spot employees who may need support. Your first task is to rank employees by their total sales.
The approach: build a CTE that computes each employee’s total sales by joining employees, orders, and order_details and summing line-item revenue. Then, in the main query, apply RANK() over the total in descending order so the top salesperson gets rank 1.
WITH employee_sales AS (
SELECT e.employee_id,
e.first_name || ' ' || e.last_name AS employee_name,
SUM(od.unit_price * od.quantity * (1 - od.discount)) AS total_sales
FROM employees e
JOIN orders o ON e.employee_id = o.employee_id
JOIN order_details od ON o.order_id = od.order_id
GROUP BY e.employee_id, employee_name
)
SELECT employee_name,
total_sales,
RANK() OVER(ORDER BY total_sales DESC) AS sales_rank
FROM employee_sales
ORDER BY sales_rank;RANK() is a good choice here: if two employees tie on total sales, they share a rank. The result is a clean leaderboard management can use to direct rewards and coaching.
Hint
You need three tables joined for total sales. RANK() assigns the same rank to ties and leaves gaps afterward. Order by total sales DESC so rank 1 is the top performer.
Task 2: Running Total of Monthly Sales
Now management wants the bigger picture: how sales accumulate over time. Build a running total of sales by month so they can see the company’s trajectory and spot high- and low-activity periods.
The approach: join orders and order_details, truncate order_date to the month with DATE_TRUNC, sum sales per month, then use SUM() OVER(ORDER BY month) to accumulate.
WITH monthly_sales AS (
SELECT DATE_TRUNC('month', o.order_date) AS month,
SUM(od.unit_price * od.quantity * (1 - od.discount)) AS monthly_total
FROM orders o
JOIN order_details od ON o.order_id = od.order_id
GROUP BY DATE_TRUNC('month', o.order_date)
)
SELECT month,
monthly_total,
SUM(monthly_total) OVER(ORDER BY month) AS running_total
FROM monthly_sales
ORDER BY month;The running_total column grows month by month, exactly the cumulative line a sales dashboard would plot. This is the running-total pattern from Lesson 2, applied to a real ledger.
Hint
DATE_TRUNC('month', order_date) collapses every date to the first of its month, giving you clean monthly buckets. The OVER(ORDER BY month) clause then accumulates the monthly totals into a running sum.
Task 3: Month-Over-Month Sales Growth
A running total shows direction; management also wants the rate of change. Calculate the month-over-month sales growth as a percentage so they can see where the business accelerated or contracted.
The approach: reuse the monthly totals, use LAG() to fetch the previous month’s total, and compute the percentage change. The growth rate is:
WITH monthly_sales AS (
SELECT DATE_TRUNC('month', o.order_date) AS month,
SUM(od.unit_price * od.quantity * (1 - od.discount)) AS monthly_total
FROM orders o
JOIN order_details od ON o.order_id = od.order_id
GROUP BY DATE_TRUNC('month', o.order_date)
),
with_previous AS (
SELECT month,
monthly_total,
LAG(monthly_total) OVER(ORDER BY month) AS previous_total
FROM monthly_sales
)
SELECT month,
monthly_total,
previous_total,
ROUND(((monthly_total / previous_total) - 1) * 100, 2) AS growth_pct
FROM with_previous
ORDER BY month;The first month has no predecessor, so its growth_pct is NULL. Every later month shows the percentage change from the month before — the LAG() pattern from Lesson 5 turned into a growth metric.
Hint
LAG(monthly_total) OVER(ORDER BY month) gives the previous month’s figure. Compute the growth as (current / previous - 1) * 100. Chain two CTEs — one for the monthly totals, one for the lagged value — to keep the logic readable.
Task 4: Identify High-Value Customers
Management wants to target high-value customers with promotions. Identify customers with above-average order values — the ones placing larger-than-typical orders.
The approach: in a CTE, compute the value of each order per customer. In the main query, compare each order to the overall average order value with AVG() OVER() and tag it using a CASE statement.
WITH order_values AS (
SELECT o.customer_id,
o.order_id,
SUM(od.unit_price * od.quantity * (1 - od.discount)) AS order_value
FROM orders o
JOIN order_details od ON o.order_id = od.order_id
GROUP BY o.customer_id, o.order_id
)
SELECT customer_id,
order_id,
order_value,
CASE
WHEN order_value > AVG(order_value) OVER() THEN 'Above Average'
ELSE 'Average/Below Average'
END AS value_category
FROM order_values
ORDER BY order_value DESC;The AVG(order_value) OVER() computes the average across all orders without collapsing rows, so each order can be compared to it directly — the empty-OVER() pattern from Lesson 1. As an extension, count how many ‘Above Average’ orders each customer placed to find your most consistently valuable buyers.
Hint
AVG(order_value) OVER() with an empty OVER() gives the average over every order while keeping all rows. The CASE statement then compares each order to that average. Recall from Lesson 3 that you cannot put the window function in WHERE — categorize in SELECT instead.
Task 5: Percentage of Sales for Each Category
To guide inventory and marketing, management wants to know each product category’s share of total sales. This reveals which categories drive the business.
The approach: in a CTE, total sales per category by joining categories, products, order_details, and orders. In the main query, divide each category’s total by the company-wide total — computed with a window SUM() OVER() — and multiply by 100.
WITH category_sales AS (
SELECT c.category_name,
SUM(od.unit_price * od.quantity * (1 - od.discount)) AS total_sales
FROM categories c
JOIN products p ON c.category_id = p.category_id
JOIN order_details od ON p.product_id = od.product_id
JOIN orders o ON od.order_id = o.order_id
GROUP BY c.category_name
)
SELECT category_name,
total_sales,
ROUND(total_sales / SUM(total_sales) OVER() * 100, 2) AS pct_of_total
FROM category_sales
ORDER BY pct_of_total DESC;SUM(total_sales) OVER() totals every category’s sales into one grand total, and dividing each category by it yields its percentage share. This is the “subtotal over grand total” pattern, done in a single query.
Hint
Compute per-category totals in the CTE, then divide each by SUM(total_sales) OVER() (the company-wide total) and multiply by 100. The window SUM over no partition gives you the denominator without a second query.
Task 6: Top Products per Category
Finally, drill into each category for the top three products by sales. These star performers should stay in stock and feature prominently in marketing.
The approach: total sales per product in a CTE, then use ROW_NUMBER() partitioned by category and ordered by sales descending. Filter to row numbers 1 through 3.
WITH product_sales AS (
SELECT c.category_name,
p.product_name,
SUM(od.unit_price * od.quantity * (1 - od.discount)) AS total_sales
FROM categories c
JOIN products p ON c.category_id = p.category_id
JOIN order_details od ON p.product_id = od.product_id
GROUP BY c.category_name, p.product_name
),
ranked AS (
SELECT category_name,
product_name,
total_sales,
ROW_NUMBER() OVER(PARTITION BY category_name
ORDER BY total_sales DESC) AS product_rank
FROM product_sales
)
SELECT category_name, product_name, total_sales, product_rank
FROM ranked
WHERE product_rank <= 3
ORDER BY category_name, product_rank;PARTITION BY category_name restarts the numbering for each category, so product_rank <= 3 keeps the top three per category. This is the top-N-per-group pattern from Lesson 4, applied to a real product catalog. ROW_NUMBER() guarantees exactly three rows per category even if products tie on sales.
Hint
PARTITION BY category_name makes the row numbering restart for each category. Because window functions cannot go in WHERE, compute the row number in a CTE and filter WHERE product_rank <= 3 in the outer query.
Summary
You built an end-to-end sales analytics report on the Northwind database, weaving together everything from this module. You ranked employees with RANK(), accumulated monthly sales with a running total, measured growth with LAG(), flagged high-value customers against a windowed average, computed category share with SUM() OVER(), and surfaced the top products per category with ROW_NUMBER() — all organized with CTEs and real joins.
Key Concepts
- Window functions + CTEs — compute a windowed value in a CTE, then filter or transform it in the outer query.
- Running total —
SUM() OVER(ORDER BY month)over monthly buckets built withDATE_TRUNC. - Growth rate —
LAG()to fetch the prior period, then a percentage-change formula. - Share of total — divide a per-group total by
SUM(...) OVER()(the grand total). - Top-N per group —
ROW_NUMBER() OVER(PARTITION BY ...)filtered to the first N rows.
Why This Matters
This is what analytical SQL looks like on the job. Leaderboards, cumulative trend lines, growth rates, customer segmentation, category mix, and top-seller lists are the standard contents of a real sales dashboard — and you produced every one of them with window functions and CTEs. A project like this, run end to end on a realistic database, is exactly the kind of work that demonstrates job-ready SQL skill to an employer or in a portfolio.
Next Steps
Continue to Querying Databases from Python
Move from pure SQL into Python: connect to a database and run queries with sqlite3.
Back to Module Overview
Return to the Window Functions module overview
Continue Building Your Skills
You have completed the Window Functions module — congratulations. To keep growing, extend this project with your own questions: find the top 20% of customers by purchase volume, compare each employee to the company average, or compute cumulative sales per category over time. The Northwind schema is rich enough to support dozens of analyses. Every query you write here sharpens the skills that data teams rely on every day.