Lesson 4 - Selecting Data with .iloc[]

Position-Based Selection

You learned .loc[] for label-based selection. Now you will learn .iloc[], which selects by integer position, much like indexing a NumPy array.

By the end of this lesson, you will be able to:

Select data by integer position using .iloc[]
Use negative indices to count from the end
Understand the difference between .loc[] and .iloc[]
Choose the right selection method for your task
Apply NumPy-style indexing to DataFrames

Position-based selection is useful when you need to work with data by location rather than by name. Let’s explore when and how to use it.

Understanding .iloc[]

The .iloc[] accessor uses integer positions (0-based indexing) to select data. Single positions and slices follow the same conventions as NumPy arrays and Python lists.

Syntax:

df.iloc[row_position, column_position]

Key Comparison: .loc[] vs .iloc[]

Feature	`.loc[]`	`.iloc[]`
Uses	Labels (names)	Positions (0, 1, 2, …)
Slicing	Inclusive both ends	Exclusive end (like Python)
Example	`df.loc['TechCorp Global', 'revenues']`	`df.iloc[0, 1]`
Negative Index	Treated as a label, not a position	Supported (`-1` = last)
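The difference between labels and positions is easiest to see when the labels are themselves integers. The following is a minimal sketch using a hypothetical Series named scores (not part of the lesson's dataset) whose integer labels do not match the positions:

```python
import pandas as pd

# Hypothetical Series: integer labels 10, 20, 30 sit at positions 0, 1, 2
scores = pd.Series([88, 92, 75], index=[10, 20, 30])

by_label = scores.loc[10]            # label 10 -> first element
by_position = scores.iloc[0]         # position 0 -> first element
last_by_position = scores.iloc[-1]   # position -1 -> last element

print(by_label, by_position, last_by_position)

# .loc looks up -1 as a label, and no row carries that label here
try:
    scores.loc[-1]
except KeyError:
    print("KeyError: -1 is not a label in this index")
```

With .loc[], an integer is always looked up as a label, so a negative number only works if a row actually has that label.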

Let’s create our sample dataset:

import pandas as pd
import numpy as np

# Create companies dataset
companies = pd.DataFrame({
    'company': ['TechCorp Global', 'FreshMart Inc', 'AutoDrive Motors', 'FirstBank Holdings', 'PowerGen Energy',
                'MediPharm Solutions', 'RetailHub Ltd', 'SkyWings Airlines', 'SteelCore Industries', 'NetLink Telecom'],
    'sector': ['Technology', 'Food', 'Automotive', 'Financials', 'Energy',
               'Healthcare', 'Retail', 'Transportation', 'Materials', 'Technology'],
    'revenues': [125000, 89000, 156000, 234000, 178000,
                 98000, 112000, 187000, 145000, 165000],
    'profits': [12000, 8500, -3000, 45000, 23000,
                15000, 9800, 21000, 18000, 28000],
    'employees': [1200, 890, 2300, 5600, 3400,
                  2100, 4500, 8900, 3200, 6700],
    'country': ['USA', 'USA', 'USA', 'UK', 'Germany',
                'USA', 'UK', 'Germany', 'USA', 'UK']
})

# Set company as index
companies = companies.set_index('company')

print(f"Shape: {companies.shape}")
companies.head()

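This creates a DataFrame with 10 companies (rows) and 5 columns. After set_index, company is the row index rather than a column, so sector is the column at position 0.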

Selecting Rows by Position

Select single or multiple rows using integer positions:

Single Row Selection

# Select first row (position 0)
first_row = companies.iloc[0]

print("First row (position 0):")
print(first_row)
print(f"Type: {type(first_row)}")

Output:

First row (position 0):
sector                Technology
revenues                  125000
profits                    12000
employees                   1200
country                      USA
Name: TechCorp Global, dtype: object
Type: <class 'pandas.core.series.Series'>

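The first row is at position 0. Selecting a single row returns a Series whose index holds the column names; its dtype is object because the row mixes text and numbers.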

Negative Indexing

# Select last row (position -1, like Python lists)
last_row = companies.iloc[-1]

print("Last row:")
print(last_row)

Output:

Last row:
sector                Technology
revenues                  165000
profits                    28000
employees                   6700
country                       UK
Name: NetLink Telecom, dtype: object

Negative indices count from the end: -1 is the last row, -2 is second-to-last, etc.
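It can help to see what happens at the edges. The sketch below uses a small stand-in DataFrame (hypothetical data, three rows) to show a negative position, a single position that is out of range, and a slice that runs past the end:

```python
import pandas as pd

# Small stand-in frame (hypothetical data) with 3 rows
df = pd.DataFrame({"value": [10, 20, 30]}, index=["a", "b", "c"])

# -2 is the second-to-last row, here the row labelled "b"
second_to_last = df.iloc[-2]
print(second_to_last.name)

# A single out-of-range position raises IndexError
try:
    df.iloc[3]
except IndexError as exc:
    print(f"IndexError: {exc}")

# An out-of-range slice is clipped to the available rows instead of raising
clipped = df.iloc[1:100]
print(clipped.shape)
```

This mirrors Python lists: indexing past the end is an error, while slicing past the end simply returns what exists.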

Row Slicing

Important Difference

Slicing with .iloc[] is exclusive on the end, just like Python slicing. This is different from .loc[]!

# Select first 3 rows (positions 0, 1, 2)
first_three = companies.iloc[0:3]  # EXCLUSIVE end!

print(f"Shape: {first_three.shape}")
first_three

This gets rows at positions 0, 1, and 2 (3 rows total).

Comparison of Slicing:

# .iloc[] slicing (EXCLUSIVE end)
print("iloc[0:3] gives rows at positions 0, 1, 2")
print(f"Result: {companies.iloc[0:3].shape[0]} rows")

# .loc[] slicing (INCLUSIVE end)
print("\nloc['TechCorp Global':'AutoDrive Motors'] includes both endpoints")
print(f"Result: {companies.loc['TechCorp Global':'AutoDrive Motors'].shape[0]} rows")

Output:

iloc[0:3] gives rows at positions 0, 1, 2
Result: 3 rows

loc['TechCorp Global':'AutoDrive Motors'] includes both endpoints
Result: 3 rows

Visual representation:

Position  Row Label
   0      TechCorp Global       ← iloc[0:3] starts here
   1      FreshMart Inc
   2      AutoDrive Motors    ← iloc[0:3] stops BEFORE position 3
   3      FirstBank Holdings    ← Not included!
   4      PowerGen Energy
   ...
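The inclusive/exclusive difference is most likely to cause surprises when a DataFrame keeps its default integer index, because the same slice then looks valid for both accessors. A minimal sketch, assuming a hypothetical five-row frame with the default index 0 to 4:

```python
import pandas as pd

# Hypothetical frame with the default RangeIndex: labels 0..4 equal positions 0..4
df = pd.DataFrame({"n": [10, 20, 30, 40, 50]})

by_position = df.iloc[0:3]   # positions 0, 1, 2 (end excluded)
by_label = df.loc[0:3]       # labels 0, 1, 2, 3 (end included)

print(len(by_position), len(by_label))
```

The two slices look identical but return a different number of rows, so it is worth checking which accessor a slice is written for.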

Selecting Specific Rows

# Select rows at positions 0, 3, 7
selected_rows = companies.iloc[[0, 3, 7]]

print("Rows at positions 0, 3, 7:")
selected_rows

Pass a list of positions to select non-consecutive rows. The outer brackets belong to .iloc[]; the inner brackets form the list.
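The brackets also decide the return type. As a small sketch on a hypothetical two-row frame, a bare integer returns a Series, while a one-element list returns a one-row DataFrame:

```python
import pandas as pd

# Hypothetical two-row frame using values from the lesson's dataset
df = pd.DataFrame(
    {"revenues": [125000, 89000], "profits": [12000, 8500]},
    index=["TechCorp Global", "FreshMart Inc"],
)

as_series = df.iloc[0]     # integer -> Series (the row is flattened)
as_frame = df.iloc[[0]]    # list -> DataFrame with one row

print(type(as_series).__name__, type(as_frame).__name__)
print(as_frame.shape)
```

The list form is useful when later code expects a DataFrame regardless of how many rows were selected.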

Selecting Rows and Columns Together

Just like NumPy: df.iloc[rows, columns]

Single Element

# Select row 0, column 2
value = companies.iloc[0, 2]

print(f"Value at position [0, 2]: ${value:,}")
print(f"This is '{companies.columns[2]}' for '{companies.index[0]}'")

Output:

Value at position [0, 2]: $12,000
This is 'profits' for 'TechCorp Global'

You need to know that column 2 is ‘profits’; the code itself does not say so.
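One way to avoid hard-coding a column position is to look it up from the column name with Index.get_loc. This is a sketch on a hypothetical two-row frame; the variable name profits_pos is only for illustration:

```python
import pandas as pd

# Hypothetical frame with the same first three columns as the lesson's dataset
df = pd.DataFrame(
    {
        "sector": ["Technology", "Food"],
        "revenues": [125000, 89000],
        "profits": [12000, 8500],
    },
    index=["TechCorp Global", "FreshMart Inc"],
)

# Translate the column name into its integer position
profits_pos = df.columns.get_loc("profits")

# Use that position with .iloc[]
value = df.iloc[0, profits_pos]
print(profits_pos, value)
```

The lookup keeps working if columns are reordered, which a literal 2 would not.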

Row and Column Slices

# First 3 rows, first 2 columns
subset = companies.iloc[0:3, 0:2]

print(f"Shape: {subset.shape}")
subset

Output:

Shape: (3, 2)
                      sector  revenues
company
TechCorp Global   Technology    125000
FreshMart Inc           Food     89000
AutoDrive Motors  Automotive    156000

Selecting Specific Columns

# All rows, columns at positions 1 and 2
revenues_profits = companies.iloc[:, [1, 2]]

print(f"Columns at positions 1 and 2: {list(companies.columns[[1, 2]])}")
revenues_profits.head()

The : means “all rows”, and [1, 2] selects columns at positions 1 and 2.

Last N Rows

# Last 5 rows, first 3 columns
last_five = companies.iloc[-5:, :3]

print(f"Shape: {last_five.shape}")
last_five

Negative indices work great for “last N” selections.

Step Slicing

# Every other row
every_other = companies.iloc[::2, :]

print(f"Every other row (step=2): {every_other.shape[0]} rows")
every_other

Output shows rows at positions 0, 2, 4, 6, 8 (every other row).
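The full slice form is start:stop:step, so a step can be combined with a start offset or made negative. A brief sketch on a hypothetical ten-row frame:

```python
import pandas as pd

# Hypothetical frame whose single column equals the row position
df = pd.DataFrame({"n": range(10)})

odd_positions = df.iloc[1::2]    # start at position 1, then every second row
reversed_rows = df.iloc[::-1]    # negative step walks the rows backwards

print(list(odd_positions["n"]))
print(list(reversed_rows["n"]))
```

These behave the same way as the equivalent slices on a Python list.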

Specific Rows and Columns

# Rows 1, 3, 5 and columns 0, 2, 4
subset2 = companies.iloc[[1, 3, 5], [0, 2, 4]]

print("Selected rows: 1, 3, 5")
print(f"Selected columns: {list(companies.columns[[0, 2, 4]])}")
subset2

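Combine lists of positions for both dimensions. The result contains every listed row crossed with every listed column, so this selection is a 3x3 DataFrame.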
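This is one place where .iloc[] and plain NumPy indexing differ. The sketch below, on a hypothetical 3x4 array, compares the two: .iloc[] with two lists returns the block of all row/column combinations, whereas NumPy pairs the lists element by element unless np.ix_ is used.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x4 array holding 0..11, and the same data as a DataFrame
arr = np.arange(12).reshape(3, 4)
df = pd.DataFrame(arr)

# pandas: rows 0 and 2 crossed with columns 1 and 3 -> 2x2 block
block = df.iloc[[0, 2], [1, 3]]

# NumPy: pairs the lists up -> elements (0, 1) and (2, 3) only
pairs = arr[[0, 2], [1, 3]]

# NumPy equivalent of the pandas block
numpy_block = arr[np.ix_([0, 2], [1, 3])]

print(block.to_numpy().tolist())
print(pairs.tolist())
print(numpy_block.tolist())
```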

When to Use .loc[] vs .iloc[]

Choose the right tool for the job:

Use .loc[] When:

You know the label (row name, column name)
You want readable, maintainable code
Working with business logic (“get data for TechCorp Global”)
The selection has semantic meaning

Example:

# Get data for specific company (label-based makes sense)
techcorp_data = companies.loc['TechCorp Global']
print("TechCorp Global data (using .loc):")
print(techcorp_data)

This is self-documenting and clear.

Use .iloc[] When:

You know the position (row 5, column 2)
You need first/last N rows
You’re iterating with indices
Working with generic operations (sampling, every Nth row)
Position-based logic is more natural

Example:

# Get first 5 companies (position-based makes sense)
first_5 = companies.iloc[:5]
print("First 5 companies:")
print(first_5)

For “first 5 rows”, position-based selection is natural.

Side-by-Side Comparison

# Both can achieve the same result!

# Method 1: .iloc (position-based)
revenue_iloc = companies.iloc[0, 1]  # Must know column position!

# Method 2: .loc (label-based) - MORE READABLE!
revenue_loc = companies.loc['TechCorp Global', 'revenues']

print(f"Using .iloc[0, 1]: ${revenue_iloc:,}")
print(f"Using .loc['TechCorp Global', 'revenues']: ${revenue_loc:,}")
print(f"\nSame result? {revenue_iloc == revenue_loc}")
print("\nBUT .loc is more readable and maintainable!")

Both work, but .loc[] is clearer.

Real-World Scenario

Imagine your boss asks: “What are the revenues for PowerGen Energy?”

Bad approach (requires manual counting):

# You count manually: PowerGen Energy is row 4, revenues is column 1
answer = companies.iloc[4, 1]  # What if data order changes?

Good approach (clear and robust):

answer = companies.loc['PowerGen Energy', 'revenues']  # Self-documenting!

The .loc[] version is:

Readable: Anyone knows what you’re selecting
Robust: Works even if row order changes
Maintainable: Easy to understand months later
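Sometimes one axis is naturally a label and the other a position. Rather than counting by hand, the known half can be translated into the other form. This is a sketch on a hypothetical two-row frame; the variable names are illustrative only:

```python
import pandas as pd

# Hypothetical two-row frame using values from the lesson's dataset
df = pd.DataFrame(
    {"revenues": [125000, 178000], "profits": [12000, 23000]},
    index=["TechCorp Global", "PowerGen Energy"],
)

# Row by position, column by label: turn the position into a label for .loc[]
last_revenue = df.loc[df.index[-1], "revenues"]

# Row by label, column by position: turn the label into a position for .iloc[]
row_pos = df.index.get_loc("PowerGen Energy")
first_col_value = df.iloc[row_pos, 0]

print(last_revenue, row_pos, first_col_value)
```

Both lookups stay correct if rows are reordered, because nothing depends on a manually counted position.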

When .iloc[] Makes Sense

# Sample 3 random companies (position-based makes sense)
sample_indices = np.random.choice(companies.shape[0], size=3, replace=False)
random_sample = companies.iloc[sample_indices]

print(f"Random sample of 3 companies (positions {sample_indices}):")
random_sample

For random sampling or algorithmic selection, .iloc[] is appropriate.
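A random sample changes on every run unless the random generator is seeded. The sketch below shows one way to make a position-based sample repeatable with NumPy's default_rng, alongside pandas' built-in DataFrame.sample; the seed value 42 and the ten-row frame are arbitrary assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical ten-row frame standing in for the companies data
df = pd.DataFrame({"n": range(10)})

# Seeded generator: the same seed yields the same positions on every run
rng = np.random.default_rng(seed=42)
positions = rng.choice(len(df), size=3, replace=False)
sample_a = df.iloc[positions]

# Re-creating the generator with the same seed reproduces the sample
rng_again = np.random.default_rng(seed=42)
sample_b = df.iloc[rng_again.choice(len(df), size=3, replace=False)]

# pandas also offers sampling directly, with its own seed parameter
builtin_sample = df.sample(n=3, random_state=42)

print(sample_a.equals(sample_b))
```

The two approaches are not guaranteed to pick the same rows as each other for a given seed; each is only repeatable against itself.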

Selection Syntax Summary

Complete reference for .iloc[]:

# Single row
df.iloc[0]                 # First row
df.iloc[-1]                # Last row

# Multiple rows
df.iloc[[0, 2, 4]]         # Specific rows
df.iloc[0:5]               # First 5 rows (0,1,2,3,4)
df.iloc[-3:]               # Last 3 rows
df.iloc[::2]               # Every other row

# Single element
df.iloc[0, 2]              # Row 0, column 2

# Rows and columns together
df.iloc[:3, :2]            # First 3 rows, first 2 columns
df.iloc[[0,1], [2,3]]      # Specific rows and columns
df.iloc[-5:, -3:]          # Last 5 rows, last 3 columns
df.iloc[::2, 1:4]          # Every other row, columns 1-3

# All rows or all columns
df.iloc[:, 0]              # All rows, first column
df.iloc[0, :]              # First row, all columns

Visual guide:

DataFrame positions:
         Col0  Col1  Col2  Col3
Row0      a     b     c     d
Row1      e     f     g     h
Row2      i     j     k     l

df.iloc[0, 1]           → b (single value)
df.iloc[0]              → [a, b, c, d] (Series)
df.iloc[[0, 2]]         → Rows 0 and 2 (DataFrame)
df.iloc[0:2]            → Rows 0 and 1 (exclusive!)
df.iloc[:, 1]           → [b, f, j] (Series)
df.iloc[:, [0, 2]]      → Columns 0 and 2 (DataFrame)
df.iloc[0:2, 1:3]       → 2x2 subset (DataFrame)
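The visual guide can be checked directly by building the same 3x4 grid of letters. This is a small sketch; the name grid is chosen here for illustration:

```python
import pandas as pd

# Build the grid from the visual guide: letters a..l in 3 rows and 4 columns
grid = pd.DataFrame(
    [list("abcd"), list("efgh"), list("ijkl")],
    index=["Row0", "Row1", "Row2"],
    columns=["Col0", "Col1", "Col2", "Col3"],
)

single = grid.iloc[0, 1]        # single value
row = grid.iloc[0]              # first row as a Series
col = grid.iloc[:, 1]           # second column as a Series
block = grid.iloc[0:2, 1:3]     # rows 0-1, columns 1-2 as a DataFrame

print(single)
print(list(row))
print(list(col))
print(block.to_numpy().tolist())
```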

Practice Exercises

Apply position-based selection with these exercises.

Exercise 1: Basic .iloc[] Operations

Get the second row
Get rows 3, 4, 5 using a slice
Get the last 3 rows

# Your code here

Hint

Remember: .iloc[] slicing is exclusive on the end, just like Python!

Exercise 2: Rows and Columns Together

Get the value at row 2, column 3
Get first 4 rows and first 2 columns
Get all rows, columns 1, 3, and 4

# Your code here

Exercise 3: .loc[] vs .iloc[] Comparison

For the company ‘FirstBank Holdings’:

Get its profits using .loc[]
Get its profits using .iloc[] (find its position first)
Which method is easier and why?

# Your code here

Exercise 4: Advanced Selection

Get every 3rd row (rows 0, 3, 6, 9)
Get the middle 4 rows
Get last 2 rows and last 2 columns

# Your code here

Summary

You now understand position-based selection with .iloc[]. Let’s review the key concepts.

Key Concepts

.iloc[] Uses Integer Positions

0-based indexing like NumPy and Python
Position 0 is first row/column
Position -1 is last row/column

Slicing is Exclusive on End

df.iloc[0:3] gets positions 0, 1, 2 (NOT 3)
Different from .loc[] which is inclusive
Same as Python list slicing

Negative Indices Work

-1 is last row/column
-2 is second-to-last
df.iloc[-5:] gets last 5 rows

Choose the Right Tool

.loc[] for labels → readable, maintainable
.iloc[] for positions → generic operations

Syntax Reference

# Rows
df.iloc[0]              # First row
df.iloc[-1]             # Last row
df.iloc[0:5]            # First 5 rows (0-4)
df.iloc[[0,2,4]]        # Specific rows
df.iloc[::2]            # Every other row

# Rows and columns
df.iloc[0, 2]           # Single value
df.iloc[:3, :2]         # First 3 rows, first 2 cols
df.iloc[[0,1], [2,3]]   # Specific rows and cols
df.iloc[-5:, -3:]       # Last 5 rows, last 3 cols

.loc[] vs .iloc[] Decision Guide

Situation	Use `.loc[]`	Use `.iloc[]`
Know the name	✅ Yes	❌ No
Know the position	❌ No	✅ Yes
First/last N rows	❌ No	✅ Yes
Specific company/ID	✅ Yes	❌ No
Random sampling	❌ No	✅ Yes
Business logic	✅ Yes	❌ No
Generic operations	❌ No	✅ Yes

Important Reminders

Positions, not labels: Use integers (0, 1, 2, …)
Exclusive slicing: End position not included in slice
Negative indexing: Count from the end with -1, -2, etc.
Use lists for multiple: [0, 2, 4] for specific positions
Prefer .loc[]: More readable for production code
Use .iloc[]: For truly position-based operations

Next Steps

You can now select data using both labels and positions. In the next lesson, you will learn Series operations and value counting—essential for analyzing categorical data.

Continue to Lesson 5 - Series Operations & Value Counts

Learn to perform calculations on Series and analyze categorical data

Back to Lesson 3 - Selecting with .loc[]

Review label-based selection using meaningful names

Master Both Selection Methods

Label-based selection with .loc[] makes code readable. Position-based selection with .iloc[] handles generic operations. Knowing both makes you a versatile pandas user.

Prefer .loc[] when possible—it’s more maintainable. Use .iloc[] when position-based logic makes sense!
