Lesson 4 - Ranking Functions: ROW_NUMBER, RANK, and NTILE

Welcome to Ranking Functions

Some of the most common analytical questions are about order and position: Which observation is best, second best, or worst? What are the top three products in each category? Which customers fall in the top 25% by spending? These are hard to answer with ordinary SQL, but ranking window functions make them straightforward.

In this lesson you will meet four ranking functions. The first three — ROW_NUMBER(), RANK(), and DENSE_RANK() — assign a sequential number to each row. The fourth, NTILE(), splits rows into a fixed number of buckets. The differences between them are subtle but important, and choosing the right one is the whole game.

By the end of this lesson, you will be able to:

Explain the purpose of ranking window functions
Use ROW_NUMBER(), RANK(), and DENSE_RANK() and describe how each handles ties
Use NTILE() to divide rows into equal-sized buckets
Combine ranking functions with CTEs to solve top-N and “nth place” problems

This lesson assumes you are comfortable with OVER(), PARTITION BY, and ORDER BY. Let’s dive in.

Data for this lesson

Engine: PostgreSQL — the examples print small sample tables inline.

Tables used: trips, employees

Want to run them yourself? Download window_practice.db and use any SQLite client (SQLite supports window functions too), or recreate the tables in PostgreSQL with psql.

How Ranking Functions Use the Window

All four ranking functions share the same windowing rules:

PARTITION BY is optional. Without it, the whole result set is one partition.
ORDER BY is technically optional but strongly required in practice — it determines the rank. Without an ordering, the ranks are meaningless and unpredictable.
Framing does not apply. Ranking is always relative to the entire partition.

The simplest form is:

RANKING_FUNCTION() OVER (
    [PARTITION BY column]
    ORDER BY column
)

ROW_NUMBER(), RANK(), and DENSE_RANK() take no arguments, but the empty parentheses are required. NTILE(n) takes the number of buckets.

For the examples, we will use a trips table of twelve bike trips. Each row has the usual trip details plus a rider_rating column from 2 to 5 indicating how satisfied the rider was.

The ROW_NUMBER() Function

ROW_NUMBER() assigns a unique integer to every row in the partition, starting at 1, in the order set by ORDER BY. Even rows with identical values get distinct numbers.

SELECT start_date, bike_number, member_type, rider_rating,
       ROW_NUMBER() OVER(ORDER BY rider_rating)
  FROM trips;

start_date          | bike_number | member_type | rider_rating | row_number
--------------------+-------------+-------------+--------------+-----------
2017-10-01 05:01:00 | W23254      | Member      | 2            | 1
2017-10-01 05:01:00 | W00143      | Member      | 3            | 2
2017-10-01 03:08:00 | W23272      | Member      | 3            | 3
2017-10-04 05:21:00 | W22051      | Casual      | 3            | 4
2017-10-05 08:08:00 | W20184      | Member      | 3            | 5
2017-10-05 08:08:00 | W00895      | Casual      | 4            | 6
...                 |             |             |              |
2017-10-04 08:30:00 | W22517      | Casual      | 5            | 12

Notice that the four rows with rider_rating = 3 still get distinct numbers (2, 3, 4, 5). Add PARTITION BY to restart the numbering within each group:

SELECT start_date, bike_number, member_type, rider_rating,
       ROW_NUMBER() OVER(PARTITION BY member_type
                         ORDER BY rider_rating)
  FROM trips;

Now the Casual partition runs 1 through 5 and the Member partition restarts at 1.

ROW_NUMBER() is nondeterministic on ties

When the ORDER BY columns are not unique, there is no guarantee ROW_NUMBER() assigns the same number to tied rows on every run. For reproducible results, order by something unique (or add a tiebreaker column to the ORDER BY).

The RANK() Function

RANK() looks like ROW_NUMBER() but treats ties differently: rows with equal ORDER BY values get the same rank, and the next distinct value skips ahead, leaving gaps.

To find the highest-rated trips, order by rider_rating descending:

SELECT start_date, bike_number, member_type, rider_rating,
       RANK() OVER(ORDER BY rider_rating DESC)
  FROM trips;

start_date          | bike_number | member_type | rider_rating | rank
--------------------+-------------+-------------+--------------+-----
2017-10-04 08:30:00 | W22517      | Casual      | 5            | 1
2017-10-04 04:58:00 | W23052      | Casual      | 5            | 1
2017-10-03 12:00:00 | W22965      | Casual      | 5            | 1
2017-10-02 03:30:00 | W21096      | Member      | 4            | 4
2017-10-03 09:34:00 | W20095      | Member      | 4            | 4
...                 |             |             |              |

Three trips share rating 5 and all rank 1. The next rating (4) ranks 4, not 2, because three rows already occupy positions 1 through 3. RANK() answers “how many rows are ahead of this one?” — so the ranks can skip numbers.

PARTITION BY resets the ranking per group:

SELECT start_date, bike_number, member_type, rider_rating,
       RANK() OVER(PARTITION BY member_type
                   ORDER BY rider_rating DESC)
  FROM trips;

The Casual and Member partitions each restart at rank 1.

The DENSE_RANK() Function

DENSE_RANK() also gives tied rows the same rank, but it does not skip numbers. After a tie, the next distinct value gets the very next integer.

SELECT start_date, bike_number, member_type, rider_rating,
       DENSE_RANK() OVER(ORDER BY rider_rating DESC)
  FROM trips;

start_date          | bike_number | member_type | rider_rating | dense_rank
--------------------+-------------+-------------+--------------+-----------
2017-10-04 08:30:00 | W22517      | Casual      | 5            | 1
2017-10-04 04:58:00 | W23052      | Casual      | 5            | 1
2017-10-03 12:00:00 | W22965      | Casual      | 5            | 1
2017-10-02 03:30:00 | W21096      | Member      | 4            | 2
2017-10-03 09:34:00 | W20095      | Member      | 4            | 2
2017-10-04 06:07:00 | W23268      | Member      | 4            | 2
2017-10-05 08:08:00 | W00895      | Casual      | 4            | 2
2017-10-04 05:21:00 | W22051      | Casual      | 3            | 3
...                 |             |             |              |
2017-10-01 05:01:00 | W23254      | Member      | 2            | 4

The three top trips rank 1; the rating-4 trips rank 2 (no gap). DENSE_RANK() answers “how many distinct values are ahead of this one?”

Comparing the Three

Running all three side by side over rider_rating DESC makes the difference unmistakable:

SELECT start_date, bike_number, member_type, rider_rating,
       ROW_NUMBER() OVER(ORDER BY rider_rating DESC),
       RANK() OVER(ORDER BY rider_rating DESC),
       DENSE_RANK() OVER(ORDER BY rider_rating DESC)
  FROM trips;

rider_rating | row_number | rank | dense_rank
-------------+------------+------+-----------
5            | 1          | 1    | 1
5            | 2          | 1    | 1
5            | 3          | 1    | 1
4            | 4          | 4    | 2
4            | 5          | 4    | 2
4            | 6          | 4    | 2
4            | 7          | 4    | 2
3            | 8          | 8    | 3
3            | 9          | 8    | 3
3            | 10         | 8    | 3
3            | 11         | 8    | 3
2            | 12         | 12   | 4

ROW_NUMBER is always unique (1–12).
RANK repeats on ties and skips numbers (1, 1, 1, 4, …).
DENSE_RANK repeats on ties but never skips (1, 1, 1, 2, …).

Pick ROW_NUMBER() when you need a strict, unique ordering (such as keeping exactly one row per group). Pick RANK() when ties should share a position and you want gaps to reflect how many rows tied. Pick DENSE_RANK() when you want consecutive rank numbers regardless of ties (such as “second highest distinct value”).

The NTILE() Function

NTILE(n) is different from the other three: instead of numbering rows, it divides the ordered rows into n roughly equal buckets and labels each row with its bucket number, from 1 to n. It is the go-to tool for segmenting data into quartiles, deciles, or any equal-sized groups.

SELECT start_date, bike_number, duration, rider_rating,
       NTILE(5) OVER(ORDER BY duration DESC)
  FROM trips;

With twelve rows split into five buckets, the longest trips land in bucket 1 and the shortest in bucket 5. Because 12 does not divide evenly by 5, some buckets get one extra row.

How NTILE distributes uneven rows

When the row count is not divisible by the number of buckets, NTILE() makes the earlier buckets larger. With 12 rows and 5 buckets, buckets 1 and 2 get 3 rows each and buckets 3, 4, and 5 get 2 rows each. The bigger buckets always come first.

A common use is segmentation. Dividing each day’s trips into two buckets by rating — best-rated bikes in bucket 1, worst in bucket 2 — helps the company prioritize maintenance:

SELECT start_date, bike_number, rider_rating,
       NTILE(2) OVER(PARTITION BY EXTRACT(DAY FROM start_date)
                     ORDER BY rider_rating DESC)
  FROM trips;

Solving Real Problems with Ranking and CTEs

Ranking functions truly shine when you combine them with a common table expression (CTE) and then filter on the rank. Remember from Lesson 3 that you cannot put a window function in WHERE, so you compute the rank in a CTE and filter in the outer query.

The Longest Trip Each Day

Partition by day, rank by duration, and keep rank 1:

WITH longtrips AS (
    SELECT *,
           ROW_NUMBER() OVER(PARTITION BY EXTRACT(DAY FROM start_date)
                             ORDER BY duration DESC) AS row_number
      FROM trips
)
SELECT *
  FROM longtrips
 WHERE row_number = 1;

The CTE numbers trips within each day, longest first. The outer query keeps only the top trip per day. Using ROW_NUMBER() guarantees exactly one row per day even if two trips tie on duration.

The Second-Highest-Rated Trip Each Day

Here DENSE_RANK() is the right choice, because “second highest distinct rating” means rank 2 with no gaps:

WITH rating AS (
    SELECT start_date, duration, bike_number, rider_rating,
           DENSE_RANK() OVER(PARTITION BY EXTRACT(DAY FROM start_date)
                             ORDER BY rider_rating DESC) AS rank
      FROM trips
)
SELECT *
  FROM rating
 WHERE rank = 2;

This returns each day’s second-best rating, even when several trips tie for first.

Practice Exercises

Use the trips table.

Exercise 1: Number Trips by Start Date

Return start_date, bike_number, member_type, rider_rating, and a row number assigned in order of start_date.

-- Your code here

Hint

Use ROW_NUMBER() OVER(ORDER BY start_date). No PARTITION BY is needed for a single sequence across all rows.

Exercise 2: Longest Trips per Day

Return all columns plus a dense rank of each day’s trips by duration (longest = rank 1), aliased trip_dense_rank, then keep only the rows where that rank equals 1.

Hint

Put DENSE_RANK() OVER(PARTITION BY EXTRACT(DAY FROM start_date) ORDER BY duration DESC) AS trip_dense_rank in a CTE, then filter WHERE trip_dense_rank = 1 in the outer query.

Exercise 3: Bonus Buckets

Split riders into four groups by duration using NTILE(4), then add 1 to the bucket number to compute a bonus — the first group earns USD 5, the last earns USD 2. Return start_date, bike_number, member_type, duration, and the bonus, sorted by bonus descending.

Hint

Use NTILE(4) OVER(ORDER BY duration) + 1 AS bonus. The shortest trips fall in bucket 4 (bonus 5) and the longest in bucket 1 — adjust the ordering to match the reward you want, then ORDER BY bonus DESC.

Summary

You learned four ranking window functions and, crucially, how each treats ties. ROW_NUMBER() numbers every row uniquely; RANK() shares a rank on ties and leaves gaps; DENSE_RANK() shares a rank but never skips; and NTILE(n) splits rows into equal buckets. Combined with a CTE, these functions answer top-N and nth-place questions cleanly, since you compute the rank first and filter on it afterward.

Key Concepts

ROW_NUMBER() — a unique sequential integer per row; ideal for “exactly one row per group.”
RANK() — equal values share a rank and the next value skips ahead, so ranks can have gaps.
DENSE_RANK() — equal values share a rank but the next value is consecutive, with no gaps.
NTILE(n) — divides ordered rows into n roughly equal buckets, filling earlier buckets first when rows do not divide evenly.
CTE + filter — compute a rank in a CTE, then filter on it in the outer query to extract top-N or nth-place rows.

Why This Matters

Ranking is one of the most requested analytics patterns: leaderboards, top sellers per region, percentile segments for marketing, and “second-best” comparisons all rely on these functions. Knowing the difference between RANK() and DENSE_RANK() prevents subtle off-by-one bugs, and the CTE-plus-filter pattern is the standard professional way to surface top-N rows. You will use these constantly in reporting and data science work.

Next Steps

Continue to Lesson 5 - Offset Functions: LAG, LEAD, and FIRST_VALUE

Compare each row to its neighbors with LAG() and LEAD(), and pull the first, last, or nth value in a window.

Back to Module Overview

Return to the Window Functions module overview

Continue Building Your Skills

Practice predicting which ranking function a question calls for. “How many rows beat this one?” points to RANK(). “What is the third distinct value?” points to DENSE_RANK(). “Give me exactly one winner per group” points to ROW_NUMBER(). “Split customers into quartiles” points to NTILE(). The faster you map a business question to the right function, the more naturally these queries will come.

Lesson 3 - Selecting and Slicing Data

Lesson 4 - Vector Operations and Calculations

Courses

DATATWEETS

Title here

Lesson 4 - Ranking Functions: ROW_NUMBER, RANK, and NTILE

Welcome to Ranking Functions

How Ranking Functions Use the Window

The ROW_NUMBER() Function

The RANK() Function

The DENSE_RANK() Function

Comparing the Three

The NTILE() Function

Solving Real Problems with Ranking and CTEs

The Longest Trip Each Day

The Second-Highest-Rated Trip Each Day

Practice Exercises

Exercise 1: Number Trips by Start Date

Exercise 2: Longest Trips per Day

Exercise 3: Bonus Buckets

Summary

Key Concepts

Why This Matters

Next Steps

Continue to Lesson 5 - Offset Functions: LAG, LEAD, and FIRST_VALUE

Back to Module Overview

Continue Building Your Skills

Lesson 4 - Ranking Functions: ROW_NUMBER, RANK, and NTILE

Welcome to Ranking Functions#

How Ranking Functions Use the Window#

The ROW_NUMBER() Function#

The RANK() Function#

The DENSE_RANK() Function#

Comparing the Three#

The NTILE() Function#

Solving Real Problems with Ranking and CTEs#

The Longest Trip Each Day#

The Second-Highest-Rated Trip Each Day#

Practice Exercises#

Exercise 1: Number Trips by Start Date#

Exercise 2: Longest Trips per Day#

Exercise 3: Bonus Buckets#

Summary#

Key Concepts#

Why This Matters#

Next Steps#

Continue to Lesson 5 - Offset Functions: LAG, LEAD, and FIRST_VALUE

Back to Module Overview

Continue Building Your Skills#

Welcome to Ranking Functions

How Ranking Functions Use the Window

The ROW_NUMBER() Function

The RANK() Function

The DENSE_RANK() Function

Comparing the Three

The NTILE() Function

Solving Real Problems with Ranking and CTEs

The Longest Trip Each Day

The Second-Highest-Rated Trip Each Day

Practice Exercises

Exercise 1: Number Trips by Start Date

Exercise 2: Longest Trips per Day

Exercise 3: Bonus Buckets

Summary

Key Concepts

Why This Matters

Next Steps

Continue Building Your Skills