Lesson 14 - Filter and Map Functions
On this page
- Processing Data Efficiently
- The filter() Function
- More filter() Examples
- The map() Function
- More map() Examples
- map() with Multiple Iterables
- Combining filter() and map()
- filter() and map() vs List Comprehensions
- Practical Examples
- Using filter(None, iterable)
- Performance Considerations
- Looking Ahead
- Next Steps
Processing Data Efficiently
When working with data, you’ll frequently need to:
- Filter data to select only certain items
- Transform data by applying an operation to every item
Python provides two powerful built-in functions for these tasks: filter() and map(). Combined with lambda functions, they create concise, readable code for data processing.
By the end of this lesson, you’ll be able to:
- Use filter() to select data based on conditions
- Use map() to transform data
- Combine filter and map with lambda functions
- Chain multiple operations together
- Choose between filter/map and list comprehensions
Let’s start with the filter() function.
The filter() Function
The filter() function selects items from an iterable that pass a test function. It returns a filter object that you can convert to a list.
Syntax:
filter(function, iterable)
- function: A function that returns True or False
- iterable: The sequence to filter (list, tuple, etc.)
Example: Filter ratings above 4.0
Traditional approach:
ratings = [4.2, 3.8, 4.5, 4.0, 3.9, 4.6]
high_ratings = []
for rating in ratings:
    if rating > 4.0:
        high_ratings.append(rating)
print(high_ratings)
Output:
[4.2, 4.5, 4.6]
Using filter() with a regular function:
def is_high_rating(rating):
    return rating > 4.0
ratings = [4.2, 3.8, 4.5, 4.0, 3.9, 4.6]
high_ratings = list(filter(is_high_rating, ratings))
print(high_ratings)
Output:
[4.2, 4.5, 4.6]
Using filter() with lambda (most common):
ratings = [4.2, 3.8, 4.5, 4.0, 3.9, 4.6]
high_ratings = list(filter(lambda rating: rating > 4.0, ratings))
print(high_ratings)
Output:
[4.2, 4.5, 4.6]
The lambda function lambda rating: rating > 4.0 returns True for ratings above 4.0 and False otherwise. filter() keeps only the items where the function returns True.
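One detail worth seeing in action: the filter object is a single-use iterator, so once it has been converted to a list it has nothing left to give. The sketch below reuses the ratings data from above; the names first_pass and second_pass are illustrative only.

```python
ratings = [4.2, 3.8, 4.5, 4.0, 3.9, 4.6]

# filter() returns a lazy, single-use iterator rather than a list
high_ratings = filter(lambda rating: rating > 4.0, ratings)

first_pass = list(high_ratings)   # consumes the iterator
second_pass = list(high_ratings)  # nothing left to yield

print(first_pass)   # [4.2, 4.5, 4.6]
print(second_pass)  # []
```

If you need the results more than once, store the list (as the examples above do) rather than the filter object.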
More filter() Examples
Filter even numbers:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
even_numbers = list(filter(lambda x: x % 2 == 0, numbers))
print(even_numbers)
Output:
[2, 4, 6, 8, 10]
Filter books priced under $10:
books = [
    {"title": "Desert Echoes", "price": 15.00},
    {"title": "Moon Over Pine", "price": 7.50},
    {"title": "Berry Tales", "price": 0.00},
    {"title": "Golden Harvest", "price": 9.99}
]
affordable = list(filter(lambda book: book["price"] < 10, books))
for book in affordable:
    print(f"{book['title']}: ${book['price']:.2f}")
Output:
Moon Over Pine: $7.50
Berry Tales: $0.00
Golden Harvest: $9.99
Filter titles starting with a specific letter:
titles = ["Desert Echoes", "Moon Over Pine", "Berry Tales", "Bamboo Dreams", "Desert Rose"]
d_titles = list(filter(lambda title: title.startswith('D'), titles))
print(d_titles)
Output:
['Desert Echoes', 'Desert Rose']
Filter non-empty strings:
items = ["apple", "", "banana", " ", "cherry", ""]
non_empty = list(filter(lambda item: item.strip(), items))
print(non_empty)
Output:
['apple', 'banana', 'cherry']
Exercise:
Create a list of books with ratings and prices. Use filter() to get only books with ratings above 4.5 AND prices under $15.
The map() Function
The map() function applies a function to every item in an iterable and returns a map object with the results.
Syntax:
map(function, iterable)
- function: A function to apply to each item
- iterable: The sequence to process
Example: Double all numbers
Traditional approach:
numbers = [1, 2, 3, 4, 5]
doubled = []
for num in numbers:
    doubled.append(num * 2)
print(doubled)
Output:
[2, 4, 6, 8, 10]
Using map() with lambda:
numbers = [1, 2, 3, 4, 5]
doubled = list(map(lambda x: x * 2, numbers))
print(doubled)
Output:
[2, 4, 6, 8, 10]
More map() Examples
Convert prices to euros:
prices_usd = [12.99, 7.50, 15.00, 9.99]
prices_eur = list(map(lambda price: price * 0.85, prices_usd))
print(prices_eur)
Output:
[11.0415, 6.375, 12.75, 8.4915]
Extract ratings from book dictionaries:
books = [
    {"title": "Desert Echoes", "rating": 4.4},
    {"title": "Moon Over Pine", "rating": 4.6},
    {"title": "Berry Tales", "rating": 4.5}
]
ratings = list(map(lambda book: book["rating"], books))
print(ratings)
Output:
[4.4, 4.6, 4.5]
Convert strings to uppercase:
titles = ["desert echoes", "moon over pine", "berry tales"]
uppercase = list(map(lambda title: title.upper(), titles))
print(uppercase)
Output:
['DESERT ECHOES', 'MOON OVER PINE', 'BERRY TALES']
Calculate total prices with tax:
prices = [12.99, 7.50, 15.00, 9.99]
tax_rate = 0.08
prices_with_tax = list(map(lambda price: price * (1 + tax_rate), prices))
for i, price in enumerate(prices_with_tax):
    print(f"${prices[i]:.2f} → ${price:.2f}")
Output:
$12.99 → $14.03
$7.50 → $8.10
$15.00 → $16.20
$9.99 → $10.79
Exercise:
Create a list of book prices. Use map() to apply a 15% discount to each price and round to 2 decimal places.
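Several of the lambdas above only wrap a single existing function or method. In those cases the function can be passed to map() directly. A minimal sketch, assuming a made-up raw_prices list of strings alongside the titles used earlier:

```python
titles = ["desert echoes", "moon over pine", "berry tales"]
raw_prices = ["12.99", "7.50", "15.00"]  # hypothetical string data

# str.upper does the same job as lambda title: title.upper()
uppercase = list(map(str.upper, titles))

# Built-in functions work the same way
lengths = list(map(len, titles))
prices = list(map(float, raw_prices))

print(uppercase)  # ['DESERT ECHOES', 'MOON OVER PINE', 'BERRY TALES']
print(lengths)    # [13, 14, 11]
print(prices)     # [12.99, 7.5, 15.0]
```

A lambda is still the better fit when the operation needs extra arguments or arithmetic, as in the tax example.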
map() with Multiple Iterables
map() can accept multiple iterables. The function should then accept multiple parameters:
titles = ["Desert Echoes", "Moon Over Pine", "Berry Tales"]
prices = [15.00, 7.50, 0.00]
# Combine title and price
descriptions = list(map(lambda title, price: f"{title}: ${price:.2f}", titles, prices))
for desc in descriptions:
    print(desc)
Output:
Desert Echoes: $15.00
Moon Over Pine: $7.50
Berry Tales: $0.00
Calculate revenue from price and quantity:
prices = [15.00, 7.50, 9.99]
quantities = [50, 100, 75]
revenues = list(map(lambda price, qty: price * qty, prices, quantities))
for i, revenue in enumerate(revenues):
    print(f"Item {i+1}: ${revenue:.2f}")
Output:
Item 1: $750.00
Item 2: $750.00
Item 3: $749.25
Combining filter() and map()
The real power comes from chaining filter() and map() together:
Filter high-rated books, then extract their titles:
books = [
    {"title": "Desert Echoes", "rating": 4.4},
    {"title": "Moon Over Pine", "rating": 4.6},
    {"title": "Berry Tales", "rating": 4.5},
    {"title": "Golden Harvest", "rating": 4.3}
]
# First filter, then map
high_rated = filter(lambda book: book["rating"] >= 4.5, books)
titles = list(map(lambda book: book["title"], high_rated))
print(titles)
Output:
['Moon Over Pine', 'Berry Tales']
Or chain them inline:
titles = list(map(
    lambda book: book["title"],
    filter(lambda book: book["rating"] >= 4.5, books)
))
print(titles)
Output:
['Moon Over Pine', 'Berry Tales']
Filter affordable books and apply discount:
books = [
    {"title": "Desert Echoes", "price": 15.00},
    {"title": "Moon Over Pine", "price": 7.50},
    {"title": "Berry Tales", "price": 0.00},
    {"title": "Golden Harvest", "price": 9.99}
]
# Filter paid books under $10 (free books are excluded), then apply a 10% discount
affordable = filter(lambda book: 0 < book["price"] < 10, books)
discounted = list(map(lambda book: {**book, "price": book["price"] * 0.9}, affordable))
for book in discounted:
    print(f"{book['title']}: ${book['price']:.2f}")
Output:
Moon Over Pine: $6.75
Golden Harvest: $8.99
filter() and map() vs List Comprehensions
You can achieve the same results with list comprehensions, which are often more readable:
Using filter() and map():
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
result = list(map(lambda x: x ** 2, filter(lambda x: x % 2 == 0, numbers)))
print(result)
Using list comprehension:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
result = [x ** 2 for x in numbers if x % 2 == 0]
print(result)
Output (both):
[4, 16, 36, 64, 100]
When to use filter()/map():
- Working with existing functions (not lambda)
- Need lazy evaluation (processing data on-demand)
- Prefer functional programming style
When to use list comprehensions:
- Simple transformations
- Want more readable code
- Need the full list immediately
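The choice is not strictly between lazy filter()/map() and an eager list comprehension. A generator expression uses comprehension syntax (round brackets instead of square ones) but evaluates lazily, much like filter() and map(). A minimal sketch using the same even-squares task; the variable names are illustrative:

```python
numbers = range(1, 11)

# Round brackets make this a generator expression: nothing is computed yet
lazy_squares = (x ** 2 for x in numbers if x % 2 == 0)

first = next(lazy_squares)  # computes only the first matching value
rest = list(lazy_squares)   # computes the remaining values

print(first)  # 4
print(rest)   # [16, 36, 64, 100]
```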
Example with existing function:
def is_valid_rating(rating):
    """Check if rating is valid (0-5)."""
    return 0 <= rating <= 5
ratings = [4.2, -1, 4.5, 6.0, 4.0, 3.9]
valid_ratings = list(filter(is_valid_rating, ratings))
print(valid_ratings)
Output:
[4.2, 4.5, 4.0, 3.9]
Practical Examples
Clean and process data:
# Sample data
raw_ratings = ["4.2", "invalid", "4.5", "", "4.0", "3.9"]
# Filter out invalid entries
def is_valid(value):
    try:
        float(value)
        return True
    except ValueError:
        return False
valid_ratings = filter(is_valid, raw_ratings)
# Convert to float
ratings = list(map(float, valid_ratings))
print(ratings)
print(f"Average: {sum(ratings) / len(ratings):.2f}")
Output:
[4.2, 4.5, 4.0, 3.9]
Average: 4.15
Calculate statistics:
books = [
    {"title": "Desert Echoes", "price": 15.00, "reviews": 1724500, "rating": 4.4},
    {"title": "Moon Over Pine", "price": 7.50, "reviews": 899000, "rating": 4.6},
    {"title": "Berry Tales", "price": 0.00, "reviews": 985500, "rating": 4.5},
    {"title": "Golden Harvest", "price": 9.99, "reviews": 990000, "rating": 4.3}
]
# Filter paid books
paid_books = filter(lambda book: book["price"] > 0, books)
# Calculate revenue potential (price * reviews)
potential_revenues = map(lambda book: book["price"] * book["reviews"], paid_books)
total_potential = sum(potential_revenues)
print(f"Total potential revenue: ${total_potential:,.2f}")
Output:
Total potential revenue: $42,500,100.00
Data transformation pipeline:
# Raw sales data
sales = [
    "Desert Echoes,15.00,50",
    "Moon Over Pine,7.50,100",
    "invalid,data,here",
    "Berry Tales,0.00,200",
    "Golden Harvest,9.99,75"
]
# Parse valid lines
def parse_sale(line):
    try:
        parts = line.split(',')
        return {
            "title": parts[0],
            "price": float(parts[1]),
            "quantity": int(parts[2])
        }
    except (ValueError, IndexError):
        return None
# Filter and map
parsed = map(parse_sale, sales)
valid_sales = filter(lambda x: x is not None, parsed)
# Calculate revenue
revenues = map(lambda sale: sale["price"] * sale["quantity"], valid_sales)
total = sum(revenues)
print(f"Total revenue: ${total:.2f}")
Output:
Total revenue: $2249.25
Using filter(None, iterable)
A special case: filter(None, iterable) removes all “falsy” values (empty strings, 0, None, False, etc.):
mixed_data = [1, 0, "hello", "", None, "world", False, 42]
cleaned = list(filter(None, mixed_data))
print(cleaned)
Output:
[1, 'hello', 'world', 42]
This is useful for cleaning datasets:
titles = ["Desert Echoes", "", "Moon Over Pine", None, "Berry Tales", " "]
# Remove empty/None values (but keep strings with spaces)
cleaned = list(filter(None, titles))
print(cleaned)
# Remove empty and whitespace-only strings
truly_cleaned = list(filter(lambda x: x and x.strip(), titles))
print(truly_cleaned)
Output:
['Desert Echoes', 'Moon Over Pine', 'Berry Tales', ' ']
['Desert Echoes', 'Moon Over Pine', 'Berry Tales']
Performance Considerations
filter() and map() return iterators, which means they’re lazy—they don’t compute results until you ask for them. This can save memory with large datasets:
# This doesn't compute anything yet
huge_numbers = range(1000000)
filtered = filter(lambda x: x % 2 == 0, huge_numbers)
squared = map(lambda x: x ** 2, filtered)
# Results are computed only when you iterate or convert to list
first_five = []
for i, num in enumerate(squared):
    if i >= 5:
        break
    first_five.append(num)
print(first_five)
Output:
[0, 4, 16, 36, 64]
Only the first few squares were calculated (six in total, counting the value that triggered the break), not all million!
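The standard library's itertools.islice offers a shorter way to take the first few items from a lazy pipeline without writing the loop by hand. A minimal sketch of the same task; the name even_squares is illustrative:

```python
from itertools import islice

huge_numbers = range(1000000)
even_squares = map(lambda x: x ** 2, filter(lambda x: x % 2 == 0, huge_numbers))

# islice pulls exactly five values from the pipeline and then stops
first_five = list(islice(even_squares, 5))
print(first_five)  # [0, 4, 16, 36, 64]

# The pipeline is still usable and resumes where islice left off
next_value = next(even_squares)
print(next_value)  # 100
```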
Looking Ahead
Congratulations! You’ve now completed the Python Basics module. You’ve learned:
- Basic Python syntax and operations
- Variables and data types
- Lists and for loops
- Conditional statements
- Dictionaries
- Functions
- Working with files
- Exception handling
- List comprehensions
- Lambda functions
- Filter and map functions
These fundamentals provide a solid foundation for data analytics with Python. You’re now ready to move on to more advanced topics like:
- Working with pandas for data manipulation
- Data visualization with matplotlib and seaborn
- Statistical analysis
- Working with APIs and databases
- Machine learning fundamentals
Exercise: Given this bookstore dataset:
books = [
    {"title": "Desert Echoes", "price": 15.00, "rating": 4.4, "genre": "Fiction"},
    {"title": "Moon Over Pine", "price": 7.50, "rating": 4.6, "genre": "Poetry"},
    {"title": "Berry Tales", "price": 0.00, "rating": 4.5, "genre": "Children"},
    {"title": "Golden Harvest", "price": 9.99, "rating": 4.3, "genre": "Fiction"},
    {"title": "Data Science 101", "price": 29.99, "rating": 4.7, "genre": "Technical"},
    {"title": "Python Mastery", "price": 24.99, "rating": 4.8, "genre": "Technical"}
]
Use filter() and map() to:
- Find all technical books with ratings above 4.5 and extract their titles
- Calculate the average price of fiction books
- Create a list of dictionaries with title and discounted price (20% off) for all books priced over $10
- Find the total number of non-free books with ratings of 4.5 or higher
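A hint that applies to the counting and averaging tasks: filter and map objects have no length, so calling len() on them raises a TypeError. The sketch below uses the ratings list from earlier in the lesson rather than the exercise data, and shows two possible workarounds; the variable names are illustrative.

```python
ratings = [4.2, 3.8, 4.5, 4.0, 3.9, 4.6]

# len() does not work on a lazy filter object
high = filter(lambda r: r > 4.0, ratings)
try:
    len(high)
    len_failed = False
except TypeError:
    len_failed = True

# Option 1: convert to a list first, then use len() and sum()
high_list = list(filter(lambda r: r > 4.0, ratings))
average = sum(high_list) / len(high_list)

# Option 2: count matches without building a list
count = sum(1 for _ in filter(lambda r: r > 4.0, ratings))

print(len_failed)         # True
print(f"{average:.2f}")   # 4.43
print(count)              # 3
```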
Next Steps
With the Python Basics module complete, you have a solid foundation in Python programming, including:
- Variables, data types, and operators
- For loops and conditional statements
- Lists and dictionaries
- Functions and lambda expressions
- List comprehensions, filter(), and map()
Ready for the next step?
Continue your learning journey with NumPy Fundamentals
Learn to work with numerical arrays and perform efficient computations for data analysis
Or explore other advanced topics:
- Python Advanced - Object-oriented programming, decorators, and advanced Python features
- Pandas Data Analysis - Work with DataFrames for real-world data analysis
- Data Visualization - Create charts and graphs with Matplotlib