Lesson 9 - Python Functions: Arguments, Parameters, and Debugging
In this lesson, we’ll take our understanding of Python functions a step further. You already know how to define and use functions. Now you’ll learn how to make them more powerful by passing arguments, handling multiple parameters, writing reusable code, and dealing with bugs through debugging.
By the end of this lesson, you’ll be able to:
- Create more flexible functions using parameters
- Pass arguments by position or by name
- Reuse functions to make your code cleaner and smarter
- Handle and fix common function-related bugs
Let’s get started with how to extract values from any column in a dataset using a function.
Extracting Values from Any Column
When working with real-world datasets, we often need to focus on just one column at a time — for instance, the genre or the content rating. Instead of manually writing the same logic over and over, we can write a function to extract values from any column we want.
Our dataset is a list of lists. Each row is an app, and each column is an attribute such as name, rating, size, and so on. To get all the values from a particular column, we need to:
- Loop over the dataset (skipping the header).
- Pick the value at the index we care about.
- Add that value to a new list.
Before we build complex functions, let’s work with something very common in data analysis: retrieving data from a specific column of a dataset.
About the Dataset
We’re using a dataset of over 7,000 iOS apps collected from the Apple App Store. The data is stored as a list of lists in Python, meaning:
- Each row is a list that represents one app.
- Each column holds a different piece of information, such as app name, rating, size, genre, or content rating.
Here’s a simplified view of the dataset structure:
Index | Column Name | Example |
---|---|---|
0 | ID | 284882215 |
1 | Track Name | Facebook |
2 | Size (in bytes) | 389879808 |
4 | Price | 0.0 |
5 | Rating Count Total | 2974676 |
7 | User Rating | 3.5 |
10 | Content Rating | 4+ |
11 | Genre | Social Networking |
The dataset is stored in a variable named apps_data. The first row contains the column headers, so we usually skip it when processing the data.
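As a rough illustration of this list-of-lists layout, here is a tiny stand-in dataset. The name sample_data, its five columns, and every value in it are made up for this sketch; the real apps_data has more columns, at the indexes listed in the table above.

```python
# Hypothetical miniature dataset; column positions differ from the real apps_data.
sample_data = [
    ['name', 'price', 'user_rating', 'content_rating', 'genre'],  # header row
    ['App A', '0.0', '3.5', '4+', 'Social Networking'],
    ['App B', '1.99', '4.5', '12+', 'Games'],
    ['App C', '0.0', '4.5', '4+', 'Games'],
]

header = sample_data[0]   # the first row holds the column names
rows = sample_data[1:]    # slicing from index 1 skips the header

print(header[3])     # content_rating
print(rows[0][3])    # 4+
print(len(rows))     # 3
```

Each inner list is one row, so rows[0][3] reads "first app, fourth column".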
What Are We Trying to Do?
Let’s say you want to analyze a single column — maybe user ratings, content ratings, or app genres. Instead of rewriting similar code every time, we can create a reusable function that lets us extract values from any column, simply by specifying its index.
Manual Approach (Before Using Functions)
Here’s how you might normally extract the values of the “Content Rating” column (which is at index 10):
content_ratings = []
for row in apps_data[1:]:  # skip the header
    content_rating = row[10]
    content_ratings.append(content_rating)
That’s fine — but what if we want to do this for different columns repeatedly? We’d have to copy and tweak the code every time.
Let’s make this better using a function.
Building a Reusable Column Extraction Function
Here’s how we can wrap that logic inside a function called extract():
def extract(index):
    column = []
    for row in apps_data[1:]:
        value = row[index]
        column.append(value)
    return column
What this function does:
- Takes one input: index, the column number we want to extract
- Loops through the dataset (skipping the header)
- Picks the value at the given index from each row
- Adds that value to a new list
- Returns the final list
Let’s try using this to extract the app genres:
genres = extract(11)
Now, genres will be a list containing the genre of every app in the dataset, like this:
['Social Networking', 'Photo & Video', 'Games', 'Music', ...]
Simple, clean, and reusable!
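To see the function run end to end without the full App Store file, here is a minimal sketch. The three-column apps_data below is a hypothetical stand-in, so the column indexes (1 and 2) differ from the real dataset's 10 and 11.

```python
# Hypothetical stand-in for the real apps_data (which has more rows and columns).
apps_data = [
    ['name', 'content_rating', 'genre'],   # header row
    ['App A', '4+', 'Social Networking'],
    ['App B', '12+', 'Games'],
    ['App C', '4+', 'Games'],
]

def extract(index):
    """Return every value in the given column, skipping the header."""
    column = []
    for row in apps_data[1:]:
        column.append(row[index])
    return column

genres = extract(2)
content_ratings = extract(1)

print(genres)            # ['Social Networking', 'Games', 'Games']
print(content_ratings)   # ['4+', '12+', '4+']
```

The same function serves both columns; only the index changes.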
Creating Frequency Tables from a List
In data analysis, a frequency table is a simple but powerful tool. It tells us how often each unique value appears in a dataset.
For example, if you have a list of app content ratings like ['4+', '4+', '12+', '17+', '4+'], a frequency table will tell you:
- '4+' appears 3 times
- '12+' appears 1 time
- '17+' appears 1 time
Why Use Frequency Tables?
They’re a great way to answer questions like:
- What’s the most common content rating?
- How many apps belong to each genre?
- What’s the distribution of user ratings?
Once we extract a column like the one with genres or ratings, we can pass it into a new function that creates a frequency table for us.
Building the Frequency Table Function
Let’s write a function named freq_table() that takes in a list and returns a dictionary showing the frequency of each unique value in that list:
def freq_table(column):
    frequency_table = {}
    for value in column:
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1
    return frequency_table
Here’s what’s happening in this function:
- We create an empty dictionary called frequency_table.
- Then we loop through each item in the input list.
- If that item already exists as a key in the dictionary, we add 1 to its count.
- If not, we add it to the dictionary with an initial count of 1.
- Finally, we return the dictionary with all the counts.
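As a quick check, the steps above can be run on the five-item content rating list from the earlier example. The variable names ratings and ratings_ft are just chosen for this sketch.

```python
def freq_table(column):
    """Count how many times each value appears in a list."""
    frequency_table = {}
    for value in column:
        if value in frequency_table:
            frequency_table[value] += 1   # seen before: add 1 to its count
        else:
            frequency_table[value] = 1    # first time: start the count at 1
    return frequency_table

ratings = ['4+', '4+', '12+', '17+', '4+']
ratings_ft = freq_table(ratings)
print(ratings_ft)   # {'4+': 3, '12+': 1, '17+': 1}
```

The keys appear in the order each value was first seen, because Python dictionaries keep insertion order.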
Using Both Functions Together
Now we can combine both extract() and freq_table() to build a frequency table for any column.
Let’s say we want to see how many apps exist in each genre. The “prime_genre” column is at index 11:
genres = extract(11)
genres_ft = freq_table(genres)
Let’s print the result:
print(genres_ft)
You’ll get something like this:
{
'Games': 3862,
'Entertainment': 535,
'Education': 453,
'Photo & Video': 349,
...
}
Now we can clearly see which genres are most common — no manual counting required!
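If you would rather have Python pick out the most common genre than scan the dictionary by eye, one possible approach uses the built-in max() and sorted(). The dictionary below holds only the four entries shown in the sample output, so it is a partial illustration rather than the full table.

```python
# Only the four entries shown in the sample output; the real table has more genres.
genres_ft = {
    'Games': 3862,
    'Entertainment': 535,
    'Education': 453,
    'Photo & Video': 349,
}

# max() with key=genres_ft.get compares the genres by their counts.
most_common = max(genres_ft, key=genres_ft.get)

# Sort (genre, count) pairs from the highest count to the lowest.
ranked = sorted(genres_ft.items(), key=lambda pair: pair[1], reverse=True)

print(most_common)   # Games
print(ranked[:2])    # [('Games', 3862), ('Entertainment', 535)]
```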
A More Efficient Approach: One Function to Do It All
In the previous section, we used two separate functions to build a frequency table:
- extract() – to get all values from a specific column
- freq_table() – to count the frequency of those values
That works well, but we can simplify things by combining both steps into a single function. This makes our code cleaner and easier to reuse.
Writing One Function Instead of Two
Let’s say we want to generate a frequency table for the column at index 10 (which stores content ratings like '4+', '12+', etc.). Instead of first extracting the values, we can just count them directly while looping through the dataset.
Here’s how we can do that:
def freq_table(index):
    frequency_table = {}
    for row in apps_data[1:]:  # skip header
        value = row[index]
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1
    return frequency_table
Let’s break this down:
- This function accepts one input: the column index we want to analyze.
- We loop through the dataset (ignoring the header row).
- For each row, we grab the value in the column we’re interested in.
- We update the dictionary with the count of that value.
Try It Out
Let’s use our new freq_table() function to analyze user ratings, which are in column index 7:
ratings_ft = freq_table(7)
print(ratings_ft)
This might give you something like:
{
'4.5': 2663,
'4.0': 1553,
'3.5': 702,
'3.0': 455,
...
}
Now you can instantly see how many apps have each rating. No need to extract the column manually anymore!
In the next section, we’ll take things further and make this function even more reusable by allowing it to work on different datasets, not just apps_data.
Making Our Function Truly Reusable with Multiple Parameters
So far, our freq_table() function works well, but it has a limitation: it only works with the apps_data dataset. What if you want to reuse it for another dataset in a different project?
To fix this, we’ll give the function two parameters: one for the dataset and one for the column index.
Why Multiple Parameters?
Let’s say you have two datasets:
- apps_data – iOS apps dataset
- games_data – a different dataset about games
If your function is hardcoded to only work with apps_data, it becomes useless for anything else. But if you pass the dataset as a parameter, it becomes flexible and reusable.
Updated Function: More Powerful, Still Simple
Here’s how we define the improved version:
def freq_table(data_set, index):
    frequency_table = {}
    for row in data_set[1:]:  # skip header
        value = row[index]
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1
    return frequency_table
Now we’ve got two parameters:
- data_set: the list of lists (your dataset)
- index: the column we want to analyze
Let’s Use It
Let’s try generating a frequency table for the user_rating column (index 7) in the apps_data dataset:
ratings_ft = freq_table(apps_data, 7)
print(ratings_ft)
Why This Is Better
With this version of the function:
- You can analyze any column of any dataset.
- It’s easy to share and reuse in different projects.
- It avoids hardcoding, so the function doesn’t need to be edited when the dataset changes.
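To make the "any dataset" claim concrete, here is a small sketch that runs the same function on two unrelated tables. Both apps_sample and games_sample are invented for this example; neither is the real apps_data or games_data.

```python
def freq_table(data_set, index):
    """Build a frequency table for one column of any list-of-lists dataset."""
    frequency_table = {}
    for row in data_set[1:]:   # skip header
        value = row[index]
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1
    return frequency_table

# Two hypothetical datasets with different columns.
apps_sample = [
    ['name', 'genre'],
    ['App A', 'Games'],
    ['App B', 'Games'],
    ['App C', 'Music'],
]
games_sample = [
    ['title', 'platform'],
    ['Game X', 'PC'],
    ['Game Y', 'Console'],
    ['Game Z', 'PC'],
]

apps_ft = freq_table(apps_sample, 1)
games_ft = freq_table(games_sample, 1)

print(apps_ft)    # {'Games': 2, 'Music': 1}
print(games_ft)   # {'PC': 2, 'Console': 1}
```

Nothing inside the function mentions either dataset by name, which is what lets it work on both.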
In the next section, we’ll see how Python lets us pass arguments to such functions in two different ways — keyword arguments and positional arguments — and how that affects your code.
Keyword vs Positional Arguments: What’s the Difference?
Now that our freq_table() function accepts two parameters, let’s talk about how we pass values to those parameters when calling the function.
Python lets us do it in two main ways:
- Positional arguments
- Keyword arguments
Let’s see how both work — and when to use which.
Positional Arguments
With positional arguments, the order matters.
Here’s our function again:
def freq_table(data_set, index):
    # ...
If you call it like this:
freq_table(apps_data, 7)
Python understands:
- apps_data goes to data_set
- 7 goes to index
This is short and clean — but risky. If you switch the order by mistake:
freq_table(7, apps_data) # Wrong!
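In this case Python raises a TypeError, because the integer 7 can’t be sliced like a list. With other functions, swapped arguments may not raise any error at all, leaving a bug you don’t notice right away.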
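Here is a small sketch of what the swapped call does in practice. The dataset sample_data is a made-up stand-in, and the try/except is only there so the script keeps running and can show the error message.

```python
def freq_table(data_set, index):
    frequency_table = {}
    for row in data_set[1:]:   # fails here if data_set is not a list
        value = row[index]
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1
    return frequency_table

# Hypothetical two-column dataset.
sample_data = [
    ['name', 'user_rating'],
    ['App A', '4.5'],
    ['App B', '4.0'],
]

try:
    freq_table(1, sample_data)   # arguments swapped on purpose
    outcome = 'no error'
except TypeError as err:
    outcome = 'TypeError'
    print('TypeError:', err)
```

The integer lands in data_set, and slicing it with [1:] is what triggers the error.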
Keyword Arguments
Keyword arguments are more explicit. You pass the values with their names:
freq_table(data_set=apps_data, index=7)
You can also change the order when using keywords:
freq_table(index=7, data_set=apps_data) # Still works!
This makes your code more readable and less error-prone, especially when you have many parameters.
Quick Example: Both Ways
# Positional (order matters)
ratings_ft = freq_table(apps_data, 7)
# Keyword (order doesn’t matter)
ratings_ft = freq_table(data_set=apps_data, index=7)
ratings_ft = freq_table(index=7, data_set=apps_data)
All three produce the same result.
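You can check that claim yourself with a short script. The dataset sample_data below is hypothetical. The last call mixes the two styles, which Python allows as long as the positional arguments come first.

```python
def freq_table(data_set, index):
    frequency_table = {}
    for row in data_set[1:]:   # skip header
        value = row[index]
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1
    return frequency_table

# Hypothetical two-column dataset.
sample_data = [
    ['name', 'user_rating'],
    ['App A', '4.5'],
    ['App B', '4.0'],
    ['App C', '4.5'],
]

positional = freq_table(sample_data, 1)
keyword = freq_table(data_set=sample_data, index=1)
keyword_reversed = freq_table(index=1, data_set=sample_data)
mixed = freq_table(sample_data, index=1)   # positional first, then keyword

print(positional)                                            # {'4.5': 2, '4.0': 1}
print(positional == keyword == keyword_reversed == mixed)    # True
```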
Practice: Generating More Frequency Tables
Let’s create three frequency tables for different columns using both styles:
# Content rating column (index 10) – use positional
content_ratings_ft = freq_table(apps_data, 10)
# User rating column (index 7) – use keyword (standard order)
ratings_ft = freq_table(data_set=apps_data, index=7)
# Genre column (index 11) – use keyword (reversed order)
genres_ft = freq_table(index=11, data_set=apps_data)
Summary
Style | Syntax Example | Safe from argument-order mix-ups? |
---|---|---|
Positional | func(apps_data, 7) | ❌ Not always |
Keyword | func(data_set=apps_data, index=7) | ✅ Yes |
Use positional when the meaning is obvious and the function is short.
Use keyword when the function has many parameters or to prevent mistakes.
Next up — let’s combine functions to create more powerful operations!
Combining Functions: Building Blocks for Smarter Code
One of the best parts of learning to code is this: you don’t need to solve everything from scratch.
Once you write a function, you can reuse it — not just by calling it again, but even inside other functions.
In this section, we’ll explore how to combine functions together to build more complex, smarter tools.
Goal: Calculate the Mean of a Column
We want to write a function called mean() that:
- Takes a dataset and a column index
- Extracts the values in that column
- Computes the mean (average)
- Returns the result
To do that, we’ll use functions we already built:
- extract() – gets a column’s values
- find_sum() – adds up a list
- find_length() – counts the items in a list
Let’s first revisit those helper functions.
def extract(data_set, index):
    column = []
    for row in data_set[1:]:
        column.append(row[index])
    return column

def find_sum(a_list):
    a_sum = 0
    for element in a_list:
        a_sum += float(element)  # Convert to float for numeric calculations
    return a_sum

def find_length(a_list):
    length = 0
    for _ in a_list:
        length += 1
    return length
Writing the mean() Function
Now let’s build the new function mean() using the three helpers above.
def mean(data_set, index):
    column = extract(data_set, index)
    total = find_sum(column)
    count = find_length(column)
    return total / count
This function does everything we need:
- Extracts the column using extract()
- Adds up the values using find_sum()
- Counts how many values there are using find_length()
- Divides total by count to get the average
Simple. Clean. Reusable.
Using It in Practice
Let’s say we want to find the average price of apps in the App Store dataset.
We can call:
avg_price = mean(apps_data, 4) # Column 4 is the price column
print(avg_price)
This gives us the mean price for all apps.
You can also reuse the same mean() function to calculate the average rating:
avg_rating = mean(apps_data, 7) # Column 7 is the user rating column
print(avg_rating)
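For a result you can verify by hand, here is the same chain of functions run on a three-app dataset. The dataset sample_data and its values are invented for this sketch, with price in column 1 and rating in column 2 rather than the real indexes 4 and 7.

```python
def extract(data_set, index):
    column = []
    for row in data_set[1:]:   # skip header
        column.append(row[index])
    return column

def find_sum(a_list):
    a_sum = 0
    for element in a_list:
        a_sum += float(element)   # values are stored as strings
    return a_sum

def find_length(a_list):
    length = 0
    for _ in a_list:
        length += 1
    return length

def mean(data_set, index):
    column = extract(data_set, index)
    return find_sum(column) / find_length(column)

# Hypothetical dataset: name, price, user rating.
sample_data = [
    ['name', 'price', 'user_rating'],
    ['App A', '0.0', '3.5'],
    ['App B', '1.5', '4.5'],
    ['App C', '4.5', '4.0'],
]

avg_price = mean(sample_data, 1)    # (0.0 + 1.5 + 4.5) / 3
avg_rating = mean(sample_data, 2)   # (3.5 + 4.5 + 4.0) / 3

print(avg_price)    # 2.0
print(avg_rating)   # 4.0
```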
Why This Matters
By combining small functions into a bigger one, we:
- Avoid repeating code
- Keep each function simple and focused
- Build reusable tools for future analysis
You’ll often hear this approach called function composition or functional abstraction — it’s how professionals build complex software efficiently.
In the next section, we’ll talk about something every programmer faces: errors, also known as bugs — and how to debug them.
Debugging Functions: How to Track Down and Fix Errors
Now that you’re writing more complex functions, it’s only natural that you’ll run into problems. We call these problems bugs, and the process of finding and fixing them is called debugging.
In this section, we’ll practice debugging a real piece of code step by step, just like a professional would.
A Realistic Example with Bugs
Imagine you’re trying to compute the average price and average rating of apps using this code:
avg_price = mean(apps_data, 4)
avg_rating = mean(apps_data, 7)
But when you run it, Python throws an error.
Let’s look at the full code with bugs introduced:
def find_sum(a_list):
    a_sum = 0
    for element in a_list
        a_sum += element
    return a_sum

def mean(data_set, index):
    column = extract(data_set, index)
    return find_sm(column) / find_length(column)
This code won’t work. There are a few bugs here.
Step 1: Read the Error (Traceback)
When you run this code, you’ll likely get an error like:
  File "your_script", line 16
    for element in a_list
                         ^
SyntaxError: expected ':'
This tells us:
- There’s a syntax error.
- It’s on the line: for element in a_list
- We forgot a colon (:) at the end of the for loop
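A syntax error is reported before any line of the script runs, because Python has to parse the whole file first. One way to see this without crashing your script is to hand the broken snippet to the built-in compile() as a string. This is only an illustrative sketch; the exact message wording depends on your Python version.

```python
# The buggy loop, stored as text so the rest of this script can still run.
buggy_source = "for element in a_list\n    a_sum += element\n"

error_type = None
error_line = None
try:
    # compile() parses the text without executing it.
    compile(buggy_source, "<example>", "exec")
except SyntaxError as err:
    error_type = type(err).__name__
    error_line = err.lineno   # line number inside buggy_source
    # Message text varies: newer versions say "expected ':'", older ones "invalid syntax".
    print(error_type, "on line", error_line, "-", err.msg)
```

Note that a_list is never defined here, yet no NameError appears: parsing fails before Python gets that far.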
Step 2: Fix the Syntax Error
Let’s fix the missing colon:
for element in a_list: # ← fixed!
Now if we run it again, we’ll likely get a different error:
NameError: name 'find_sm' is not defined
This tells us:
- There’s a typo in the function name
- find_sm() should be find_sum()
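Unlike a syntax error, a NameError only appears when the misspelled line actually runs. The sketch below triggers it on purpose and catches it so the message can be inspected; the simplified find_sum() here is only for illustration.

```python
def find_sum(a_list):
    a_sum = 0
    for element in a_list:
        a_sum += element
    return a_sum

message = None
try:
    find_sm([1, 2, 3])   # misspelled on purpose: no function has this name
except NameError as err:
    message = str(err)
    print('NameError:', message)

# The correctly spelled call works.
total = find_sum([1, 2, 3])
print(total)   # 6
```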
Step 3: Fix the Function Call
Update the mean() function:
return find_sum(column) / find_length(column)
Almost there. But you might still hit another error:
TypeError: unsupported operand type(s) for +=: 'int' and 'str'
This means:
- You’re trying to add a string to an integer: the dataset stores its values as strings such as '3.5', while a_sum starts as the integer 0
- We need to convert the elements to float
Fix it like this inside find_sum():
for element in a_list:
    a_sum += float(element)
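To see the failure and the fix side by side, here is a short sketch using a made-up list of rating strings.

```python
values = ['3.5', '4.0', '4.5']   # hypothetical ratings, stored as strings

# Without conversion: adding a str to an int raises TypeError.
failed = False
try:
    broken_total = 0
    for element in values:
        broken_total += element
except TypeError as err:
    failed = True
    print('TypeError:', err)

# With conversion: float() turns each string into a number first.
fixed_total = 0
for element in values:
    fixed_total += float(element)

print(fixed_total)   # 12.0
```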
Final Working Version
Here’s the corrected code:
def extract(data_set, index):
    column = []
    for row in data_set[1:]:
        column.append(row[index])
    return column

def find_sum(a_list):
    a_sum = 0
    for element in a_list:
        a_sum += float(element)
    return a_sum

def find_length(a_list):
    length = 0
    for _ in a_list:
        length += 1
    return length

def mean(data_set, index):
    column = extract(data_set, index)
    return find_sum(column) / find_length(column)

avg_price = mean(apps_data, 4)
avg_rating = mean(apps_data, 7)
Why This Skill Is Essential
As your functions grow, bugs are inevitable. But instead of feeling frustrated, take these steps:
- Read the traceback carefully
- Look at the line it points to
- Understand the error message (syntax, name, type, etc.)
- Fix one bug at a time
This is exactly what real developers do. And the more you practice it, the better you’ll become.
Review and Takeaways
You’ve made it through one of the most essential lessons in Python: writing reusable, flexible, and debuggable functions.
Whether you’re analyzing data or building full applications, functions will be your best friend in keeping your code organized, efficient, and easy to reuse.
Let’s go over what we learned.
What You Learned
✅ Creating Functions That Extract and Analyze Data
You learned to write custom functions like:
def extract(data_set, index):
    ...
And how to combine them to build something powerful:
def mean(data_set, index):
    column = extract(data_set, index)
    return find_sum(column) / find_length(column)
✅ Reusability Through Parameters
We extended our frequency table function so it could take any dataset and any column index:
def freq_table(data_set, index):
    ...
This allows us to run:
freq_table(apps_data, 7) # User ratings
freq_table(apps_data, 11) # App genres
✅ Keyword vs. Positional Arguments
We saw how these two forms work:
freq_table(apps_data, 10) # Positional
freq_table(data_set=apps_data, index=10) # Keyword
freq_table(index=10, data_set=apps_data) # Keyword (order doesn't matter)
But in positional arguments, order matters!
✅ Reusing Functions Inside Other Functions
We reused smaller tools like find_sum() and find_length() to build more complex logic. This is how real software is built: piece by piece.
def mean(data_set, index):
    column = extract(data_set, index)  # extract once, reuse for both helpers
    return find_sum(column) / find_length(column)
✅ Debugging Like a Pro
We walked through real error messages (tracebacks) and fixed:
- Syntax mistakes (missing colons)
- Typos (misspelled function names)
- Type errors (string vs. float)
By the end of this, you’ve seen that debugging is a skill you can practice and master.
Summary Table
Concept | Example Code Snippet |
---|---|
Create function | def mean(data_set, index): |
Multiple parameters | def freq_table(data_set, index): |
Positional arguments | freq_table(apps_data, 7) |
Keyword arguments | freq_table(index=7, data_set=apps_data) |
Reuse functions | return find_sum(column) / find_length(column) |
Debugging | Fix one error at a time using tracebacks |
What’s Next?
Now that you’ve learned to build and reuse functions effectively, you’re ready to start applying them in real data projects. From now on, whenever you spot a repetitive task, think: “Can I write a function for this?”
And if something breaks? Don’t panic. Use the traceback. Follow the arrows. Fix the bugs.
You’ve got this. Let’s keep going!