Lesson 14 - Filter and Map Functions

Processing Data Efficiently

When working with data, you’ll frequently need to:

  • Filter data to select only certain items
  • Transform data by applying an operation to every item

Python provides two powerful built-in functions for these tasks: filter() and map(). Combined with lambda functions, they create concise, readable code for data processing.

By the end of this lesson, you’ll be able to:

  • Use filter() to select data based on conditions
  • Use map() to transform data
  • Combine filter and map with lambda functions
  • Chain multiple operations together
  • Choose between filter/map and list comprehensions

Let’s start with the filter() function.


The filter() Function

The filter() function selects the items from an iterable for which a test function returns a true value. It returns a filter object (a lazy iterator) that you can convert to a list with list().

Syntax:

filter(function, iterable)
  • function: A function that returns True or False (any truthy or falsy value also works)
  • iterable: The iterable to filter (list, tuple, etc.)

Example: Filter ratings above 4.0

Traditional approach:

ratings = [4.2, 3.8, 4.5, 4.0, 3.9, 4.6]
high_ratings = []

for rating in ratings:
    if rating > 4.0:
        high_ratings.append(rating)

print(high_ratings)

Output:

[4.2, 4.5, 4.6]

Using filter() with a regular function:

def is_high_rating(rating):
    return rating > 4.0

ratings = [4.2, 3.8, 4.5, 4.0, 3.9, 4.6]
high_ratings = list(filter(is_high_rating, ratings))

print(high_ratings)

Output:

[4.2, 4.5, 4.6]

Using filter() with lambda (most common):

ratings = [4.2, 3.8, 4.5, 4.0, 3.9, 4.6]
high_ratings = list(filter(lambda rating: rating > 4.0, ratings))

print(high_ratings)

Output:

[4.2, 4.5, 4.6]

The lambda function lambda rating: rating > 4.0 returns True for ratings above 4.0 and False otherwise. filter() keeps only the items where the function returns True.
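
One detail worth keeping in mind: the filter object is a one-shot iterator, so it can only be consumed once. The sketch below reuses the ratings list from above; the names first_pass and second_pass are illustrative only.

```python
ratings = [4.2, 3.8, 4.5, 4.0, 3.9, 4.6]

# filter() returns a lazy iterator, not a list
high = filter(lambda rating: rating > 4.0, ratings)

first_pass = list(high)   # consumes the iterator
second_pass = list(high)  # nothing is left to yield

print(first_pass)   # [4.2, 4.5, 4.6]
print(second_pass)  # []
```

If you need the results more than once, convert to a list right away, as the examples in this lesson do.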


More filter() Examples

Filter even numbers:

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
even_numbers = list(filter(lambda x: x % 2 == 0, numbers))

print(even_numbers)

Output:

[2, 4, 6, 8, 10]

Filter books priced under $10:

books = [
    {"title": "Desert Echoes", "price": 15.00},
    {"title": "Moon Over Pine", "price": 7.50},
    {"title": "Berry Tales", "price": 0.00},
    {"title": "Golden Harvest", "price": 9.99}
]

affordable = list(filter(lambda book: book["price"] < 10, books))

for book in affordable:
    print(f"{book['title']}: ${book['price']:.2f}")

Output:

Moon Over Pine: $7.50
Berry Tales: $0.00
Golden Harvest: $9.99

Filter titles starting with a specific letter:

titles = ["Desert Echoes", "Moon Over Pine", "Berry Tales", "Bamboo Dreams", "Desert Rose"]
d_titles = list(filter(lambda title: title.startswith('D'), titles))

print(d_titles)

Output:

['Desert Echoes', 'Desert Rose']

Filter non-empty strings:

items = ["apple", "", "banana", "  ", "cherry", ""]
non_empty = list(filter(lambda item: item.strip(), items))

print(non_empty)

Output:

['apple', 'banana', 'cherry']

Exercise: Create a list of books with ratings and prices. Use filter() to get only books with ratings above 4.5 AND prices under $15.


The map() Function

The map() function applies a function to every item in an iterable and returns a map object with the results.

Syntax:

map(function, iterable)
  • function: A function to apply to each item
  • iterable: The sequence to process

Example: Double all numbers

Traditional approach:

numbers = [1, 2, 3, 4, 5]
doubled = []

for num in numbers:
    doubled.append(num * 2)

print(doubled)

Output:

[2, 4, 6, 8, 10]

Using map() with lambda:

numbers = [1, 2, 3, 4, 5]
doubled = list(map(lambda x: x * 2, numbers))

print(doubled)

Output:

[2, 4, 6, 8, 10]

More map() Examples

Convert prices to euros:

prices_usd = [12.99, 7.50, 15.00, 9.99]
prices_eur = list(map(lambda price: price * 0.85, prices_usd))

print(prices_eur)

Output (the exact trailing digits may differ slightly because of floating-point arithmetic):

[11.0415, 6.375, 12.75, 8.4915]

Extract ratings from book dictionaries:

books = [
    {"title": "Desert Echoes", "rating": 4.4},
    {"title": "Moon Over Pine", "rating": 4.6},
    {"title": "Berry Tales", "rating": 4.5}
]

ratings = list(map(lambda book: book["rating"], books))
print(ratings)

Output:

[4.4, 4.6, 4.5]

Convert strings to uppercase:

titles = ["desert echoes", "moon over pine", "berry tales"]
uppercase = list(map(lambda title: title.upper(), titles))

print(uppercase)

Output:

['DESERT ECHOES', 'MOON OVER PINE', 'BERRY TALES']
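
When the lambda only calls a single existing function or method, you can usually pass that function to map() directly. A small sketch using the same titles; the lengths list is an extra illustration, not part of the original example.

```python
titles = ["desert echoes", "moon over pine", "berry tales"]

# Pass the string method itself instead of wrapping it in a lambda
uppercase = list(map(str.upper, titles))

# Built-in functions such as len() work the same way
lengths = list(map(len, titles))

print(uppercase)  # ['DESERT ECHOES', 'MOON OVER PINE', 'BERRY TALES']
print(lengths)    # [13, 14, 11]
```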

Calculate total prices with tax:

prices = [12.99, 7.50, 15.00, 9.99]
tax_rate = 0.08

prices_with_tax = list(map(lambda price: price * (1 + tax_rate), prices))

for original, taxed in zip(prices, prices_with_tax):
    print(f"${original:.2f} → ${taxed:.2f}")

Output:

$12.99 → $14.03
$7.50 → $8.10
$15.00 → $16.20
$9.99 → $10.79

Exercise: Create a list of book prices. Use map() to apply a 15% discount to each price and round to 2 decimal places.


map() with Multiple Iterables

map() can accept multiple iterables. The function must then accept one parameter per iterable, and map() stops as soon as the shortest iterable is exhausted:

titles = ["Desert Echoes", "Moon Over Pine", "Berry Tales"]
prices = [15.00, 7.50, 0.00]

# Combine title and price
descriptions = list(map(lambda title, price: f"{title}: ${price:.2f}", titles, prices))

for desc in descriptions:
    print(desc)

Output:

Desert Echoes: $15.00
Moon Over Pine: $7.50
Berry Tales: $0.00

Calculate revenue from price and quantity:

prices = [15.00, 7.50, 9.99]
quantities = [50, 100, 75]

revenues = list(map(lambda price, qty: price * qty, prices, quantities))

for i, revenue in enumerate(revenues, start=1):
    print(f"Item {i}: ${revenue:.2f}")

Output:

Item 1: $750.00
Item 2: $750.00
Item 3: $749.25
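
When the iterables have different lengths, map() stops at the shortest one without raising an error, which can silently drop data. The sketch below assumes a hypothetical prices list that is missing one entry.

```python
titles = ["Desert Echoes", "Moon Over Pine", "Berry Tales"]
prices = [15.00, 7.50]  # hypothetical data: one price is missing

# map() stops after two items because prices has only two entries
pairs = list(map(lambda title, price: (title, price), titles, prices))

print(pairs)  # [('Desert Echoes', 15.0), ('Moon Over Pine', 7.5)]

# A simple guard makes the mismatch visible
lengths_match = len(titles) == len(prices)
print(lengths_match)  # False
```

Checking the lengths before mapping is one simple way to catch this kind of mismatch early.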

Combining filter() and map()

The real power comes from chaining filter() and map() together:

Filter high-rated books, then extract their titles:

books = [
    {"title": "Desert Echoes", "rating": 4.4},
    {"title": "Moon Over Pine", "rating": 4.6},
    {"title": "Berry Tales", "rating": 4.5},
    {"title": "Golden Harvest", "rating": 4.3}
]

# First filter, then map
high_rated = filter(lambda book: book["rating"] >= 4.5, books)
titles = list(map(lambda book: book["title"], high_rated))

print(titles)

Output:

['Moon Over Pine', 'Berry Tales']

Or chain them inline:

titles = list(map(
    lambda book: book["title"],
    filter(lambda book: book["rating"] >= 4.5, books)
))

print(titles)

Output:

['Moon Over Pine', 'Berry Tales']

Filter affordable books and apply discount:

books = [
    {"title": "Desert Echoes", "price": 15.00},
    {"title": "Moon Over Pine", "price": 7.50},
    {"title": "Berry Tales", "price": 0.00},
    {"title": "Golden Harvest", "price": 9.99}
]

# Filter paid books under $10 (free books are excluded), apply 10% discount
affordable = filter(lambda book: 0 < book["price"] < 10, books)
discounted = list(map(lambda book: {**book, "price": book["price"] * 0.9}, affordable))

for book in discounted:
    print(f"{book['title']}: ${book['price']:.2f}")

Output:

Moon Over Pine: $6.75
Golden Harvest: $8.99
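
The {**book, "price": ...} expression in the example above builds a new dictionary rather than changing the original one. A minimal sketch with a single book shows the difference; the variable names are illustrative.

```python
book = {"title": "Moon Over Pine", "price": 7.50}

# Unpack the original keys, then override "price" in the new dict
discounted = {**book, "price": book["price"] * 0.9}

print(f"{book['price']:.2f}")        # 7.50 (original is unchanged)
print(f"{discounted['price']:.2f}")  # 6.75
print(discounted is book)            # False
```

Because the source list is left untouched, you can safely run several different transformations on the same data.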

filter() and map() vs List Comprehensions

You can achieve the same results with list comprehensions, which are often more readable:

Using filter() and map():

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
result = list(map(lambda x: x ** 2, filter(lambda x: x % 2 == 0, numbers)))
print(result)

Using list comprehension:

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
result = [x ** 2 for x in numbers if x % 2 == 0]
print(result)

Output (both):

[4, 16, 36, 64, 100]

When to use filter()/map():

  • Working with existing functions (not lambda)
  • Need lazy evaluation (processing data on-demand)
  • Prefer functional programming style

When to use list comprehensions:

  • Simple transformations
  • Want more readable code
  • Need the full list immediately
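
If you like comprehension syntax but still want lazy evaluation, a generator expression (round brackets instead of square ones) is one option. A short sketch, with illustrative variable names:

```python
numbers = range(1, 11)

# Round brackets create a generator: nothing is computed yet
lazy_squares = (x ** 2 for x in numbers if x % 2 == 0)

first = next(lazy_squares)   # computes only the first result
second = next(lazy_squares)  # computes only the second result
rest = list(lazy_squares)    # computes whatever is left

print(first)   # 4
print(second)  # 16
print(rest)    # [36, 64, 100]
```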

Example with existing function:

def is_valid_rating(rating):
    """Check if rating is valid (0-5)."""
    return 0 <= rating <= 5

ratings = [4.2, -1, 4.5, 6.0, 4.0, 3.9]
valid_ratings = list(filter(is_valid_rating, ratings))

print(valid_ratings)

Output:

[4.2, 4.5, 4.0, 3.9]

Practical Examples

Clean and process data:

# Sample data
raw_ratings = ["4.2", "invalid", "4.5", "", "4.0", "3.9"]

# Filter out invalid entries
def is_valid(value):
    try:
        float(value)
        return True
    except ValueError:
        return False

valid_ratings = filter(is_valid, raw_ratings)

# Convert to float
ratings = list(map(float, valid_ratings))

print(ratings)
print(f"Average: {sum(ratings) / len(ratings):.2f}")

Output:

[4.2, 4.5, 4.0, 3.9]
Average: 4.15

Calculate statistics:

books = [
    {"title": "Desert Echoes", "price": 15.00, "reviews": 1724500, "rating": 4.4},
    {"title": "Moon Over Pine", "price": 7.50, "reviews": 899000, "rating": 4.6},
    {"title": "Berry Tales", "price": 0.00, "reviews": 985500, "rating": 4.5},
    {"title": "Golden Harvest", "price": 9.99, "reviews": 990000, "rating": 4.3}
]

# Filter paid books
paid_books = filter(lambda book: book["price"] > 0, books)

# Calculate revenue potential (price * reviews)
potential_revenues = map(lambda book: book["price"] * book["reviews"], paid_books)

total_potential = sum(potential_revenues)
print(f"Total potential revenue: ${total_potential:,.2f}")

Output:

Total potential revenue: $42,500,100.00

Data transformation pipeline:

# Raw sales data
sales = [
    "Desert Echoes,15.00,50",
    "Moon Over Pine,7.50,100",
    "invalid,data,here",
    "Berry Tales,0.00,200",
    "Golden Harvest,9.99,75"
]

# Parse valid lines
def parse_sale(line):
    try:
        parts = line.split(',')
        return {
            "title": parts[0],
            "price": float(parts[1]),
            "quantity": int(parts[2])
        }
    except (ValueError, IndexError):
        return None

# Filter and map
parsed = map(parse_sale, sales)
valid_sales = filter(lambda x: x is not None, parsed)

# Calculate revenue
revenues = map(lambda sale: sale["price"] * sale["quantity"], valid_sales)
total = sum(revenues)

print(f"Total revenue: ${total:.2f}")

Output:

Total revenue: $2249.25

Using filter(None, iterable)

A special case: filter(None, iterable) removes all “falsy” values (empty strings, 0, None, False, etc.):

mixed_data = [1, 0, "hello", "", None, "world", False, 42]
cleaned = list(filter(None, mixed_data))

print(cleaned)

Output:

[1, 'hello', 'world', 42]

This is useful for cleaning datasets:

titles = ["Desert Echoes", "", "Moon Over Pine", None, "Berry Tales", "  "]

# Remove empty/None values (but keep strings with spaces)
cleaned = list(filter(None, titles))
print(cleaned)

# Remove empty and whitespace-only strings
truly_cleaned = list(filter(lambda x: x and x.strip(), titles))
print(truly_cleaned)

Output:

['Desert Echoes', 'Moon Over Pine', 'Berry Tales', '  ']
['Desert Echoes', 'Moon Over Pine', 'Berry Tales']
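
Be careful with filter(None, ...) on numeric data: 0 is falsy, so legitimate zero values are removed along with None. The sketch below uses a hypothetical ratings list in which 0 is a real value.

```python
# Hypothetical data: 0 is a genuine rating, None means "missing"
ratings = [4.5, 0, None, 3.0, None, 0.0]

# filter(None, ...) drops every falsy value, including the zeros
dropped_zeros = list(filter(None, ratings))

# Testing explicitly for None keeps the zeros
kept_zeros = list(filter(lambda r: r is not None, ratings))

print(dropped_zeros)  # [4.5, 3.0]
print(kept_zeros)     # [4.5, 0, 3.0, 0.0]
```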

Performance Considerations

In Python 3, filter() and map() return iterators, which means they are lazy: they don't compute a result until you ask for it. This can save memory with large datasets:

# This doesn't compute anything yet
huge_numbers = range(1000000)
filtered = filter(lambda x: x % 2 == 0, huge_numbers)
squared = map(lambda x: x ** 2, filtered)

# Results are computed only when you iterate or convert to list
first_five = []
for i, num in enumerate(squared):
    if i >= 5:
        break
    first_five.append(num)

print(first_five)

Output:

[0, 4, 16, 36, 64]

Only the first five squares of even numbers were calculated, not results for all one million inputs!
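
The manual loop with break can also be written with itertools.islice from the standard library, which takes a fixed number of items from any iterator. A sketch of the same pipeline:

```python
from itertools import islice

huge_numbers = range(1000000)
squared = map(lambda x: x ** 2, filter(lambda x: x % 2 == 0, huge_numbers))

# Take only the first five results; the rest are never computed
first_five = list(islice(squared, 5))
print(first_five)  # [0, 4, 16, 36, 64]

# The iterator resumes where islice stopped
next_value = next(squared)
print(next_value)  # 100
```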


Looking Ahead

Congratulations! You’ve now completed the Python Basics module. You’ve learned:

  • Basic Python syntax and operations
  • Variables and data types
  • Lists and for loops
  • Conditional statements
  • Dictionaries
  • Functions
  • Working with files
  • Exception handling
  • List comprehensions
  • Lambda functions
  • Filter and map functions

These fundamentals provide a solid foundation for data analytics with Python. You’re now ready to move on to more advanced topics like:

  • Working with pandas for data manipulation
  • Data visualization with matplotlib and seaborn
  • Statistical analysis
  • Working with APIs and databases
  • Machine learning fundamentals

Exercise: Given this bookstore dataset:

books = [
    {"title": "Desert Echoes", "price": 15.00, "rating": 4.4, "genre": "Fiction"},
    {"title": "Moon Over Pine", "price": 7.50, "rating": 4.6, "genre": "Poetry"},
    {"title": "Berry Tales", "price": 0.00, "rating": 4.5, "genre": "Children"},
    {"title": "Golden Harvest", "price": 9.99, "rating": 4.3, "genre": "Fiction"},
    {"title": "Data Science 101", "price": 29.99, "rating": 4.7, "genre": "Technical"},
    {"title": "Python Mastery", "price": 24.99, "rating": 4.8, "genre": "Technical"}
]

Use filter() and map() to:

  1. Find all technical books with ratings above 4.5 and extract their titles
  2. Calculate the average price of fiction books
  3. Create a list of dictionaries with title and discounted price (20% off) for all books priced over $10
  4. Find the total number of non-free books with ratings of 4.5 or higher

Next Steps

With the Python Basics module complete, you have a solid foundation in Python programming, including:

  • Variables, data types, and operators
  • For loops and conditional statements
  • Lists and dictionaries
  • Functions and lambda expressions
  • List comprehensions, filter(), and map()

Ready for the next step?

Continue your learning journey with NumPy Fundamentals
Learn to work with numerical arrays and perform efficient computations for data analysis
