Lesson 9 - Python Functions: Arguments, Parameters, and Debugging

In this lesson, we’ll take our understanding of Python functions a step further. You already know how to define and use functions. Now you’ll learn how to make them more powerful by passing arguments, handling multiple parameters, writing reusable code, and dealing with bugs through debugging.

By the end of this lesson, you’ll be able to:

  • Create more flexible functions using parameters
  • Pass arguments by position or by name
  • Reuse functions to make your code cleaner and smarter
  • Handle and fix common function-related bugs

Let’s get started with how to extract values from any column in a dataset using a function.


Extracting Values from Any Column

When working with real-world datasets, we often need to focus on just one column at a time — for instance, the genre or the content rating. Instead of manually writing the same logic over and over, we can write a function to extract values from any column we want.

Our dataset is a list of lists. Each row is an app, and each column is an attribute such as name, rating, size, and so on. To get all the values from a particular column, we need to:

  1. Loop over the dataset (skipping the header).
  2. Pick the value at the index we care about.
  3. Add that value to a new list.

Before writing that function, let’s take a closer look at the dataset itself.

About the Dataset

We’re using a dataset of over 7,000 iOS apps collected from the Apple App Store. The data is stored as a list of lists in Python, meaning:

  • Each row is a list that represents one app.
  • Each column holds a different piece of information, such as app name, rating, size, genre, or content rating.

Here’s a simplified view of the dataset structure:

Index   Column Name           Example
0       ID                    284882215
1       Track Name            Facebook
2       Size (in bytes)       389879808
4       Price                 0.0
5       Rating Count Total    2974676
7       User Rating           3.5
10      Content Rating        4+
11      Genre                 Social Networking

The dataset is stored in a variable named apps_data. The first row contains the column headers, so we usually skip it when processing the data.
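
To make the structure concrete, here is a tiny hand-made stand-in for apps_data. Only the Facebook values come from the table above. The second app, the lowercase header names, the 'n/a' filler for columns the table does not list, and the choice to store every value as a string are assumptions made for illustration.

```python
# A tiny, hand-made stand-in for apps_data (illustrative only).
# 'col_3', 'col_6', 'col_8', and 'col_9' are placeholder names for the
# columns that the simplified table does not describe.
sample_apps_data = [
    ['id', 'track_name', 'size_bytes', 'col_3', 'price', 'rating_count_tot',
     'col_6', 'user_rating', 'col_8', 'col_9', 'cont_rating', 'prime_genre'],
    ['284882215', 'Facebook', '389879808', 'n/a', '0.0', '2974676',
     'n/a', '3.5', 'n/a', 'n/a', '4+', 'Social Networking'],
    # The row below is invented for illustration.
    ['1', 'Example Game', '1000', 'n/a', '1.99', '10',
     'n/a', '4.5', 'n/a', 'n/a', '12+', 'Games'],
]

header = sample_apps_data[0]   # first row: column names
rows = sample_apps_data[1:]    # every row after the header is one app

print(header[11])   # prime_genre
print(rows[0][1])   # Facebook
print(len(rows))    # 2
```

Slicing with [1:] is the same "skip the header" step used throughout the rest of this lesson.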


What Are We Trying to Do?

Let’s say you want to analyze a single column — maybe user ratings, content ratings, or app genres. Instead of rewriting similar code every time, we can create a reusable function that lets us extract values from any column, simply by specifying its index.

Manual Approach (Before Using Functions)

Here’s how you might normally extract the values of the “Content Rating” column (which is at index 10):

content_ratings = []

for row in apps_data[1:]:  # skip the header
    content_rating = row[10]
    content_ratings.append(content_rating)

That’s fine — but what if we want to do this for different columns repeatedly? We’d have to copy and tweak the code every time.

Let’s make this better using a function.


Building a Reusable Column Extraction Function

Here’s how we can wrap that logic inside a function called extract():

def extract(index):
    column = []
    for row in apps_data[1:]:
        value = row[index]
        column.append(value)
    return column

What this function does:

  • Takes one input: index, the column number we want to extract
  • Loops through the dataset (skipping the header)
  • Picks the value at the given index from each row
  • Adds that value to a new list
  • Returns the final list

Let’s try using this to extract the app genres:

genres = extract(11)

Now, genres will be a list containing the genre of every app in the dataset — like this:

['Social Networking', 'Photo & Video', 'Games', 'Music', ...]

Simple, clean, and reusable!
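
If you want to try extract() without the full dataset, the sketch below runs it on a made-up, two-app stand-in for apps_data. The rows and column names are invented for the example. Note that this version of extract() only works if a variable named apps_data already exists, a limitation we’ll come back to later in the lesson.

```python
# Hypothetical stand-in for apps_data: a header row plus two apps.
apps_data = [
    ['track_name', 'prime_genre'],
    ['App A', 'Games'],
    ['App B', 'Music'],
]

def extract(index):
    column = []
    for row in apps_data[1:]:  # relies on the global name apps_data
        column.append(row[index])
    return column

print(extract(1))  # ['Games', 'Music']
print(extract(0))  # ['App A', 'App B']
```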


Creating Frequency Tables from a List

In data analysis, a frequency table is a simple but powerful tool. It tells us how often each unique value appears in a dataset.

For example, if you have a list of app content ratings like ['4+', '4+', '12+', '17+', '4+'], a frequency table will tell you:

  • '4+' appears 3 times
  • '12+' appears 1 time
  • '17+' appears 1 time

Why Use Frequency Tables?

They’re a great way to answer questions like:

  • What’s the most common content rating?
  • How many apps belong to each genre?
  • What’s the distribution of user ratings?

Once we extract a column like the one with genres or ratings, we can pass it into a new function that creates a frequency table for us.


Building the Frequency Table Function

Let’s write a function named freq_table() that takes in a list and returns a dictionary showing the frequency of each unique value in that list:

def freq_table(column):
    frequency_table = {}

    for value in column:
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1

    return frequency_table

Here’s what’s happening in this function:

  • We create an empty dictionary called frequency_table.
  • Then we loop through each item in the input list.
  • If that item already exists as a key in the dictionary, we add 1 to its count.
  • If not, we add it to the dictionary with an initial count of 1.
  • Finally, we return the dictionary with all the counts.
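
As a quick check, the sketch below runs freq_table() on the small list of content ratings from earlier and compares the result with collections.Counter from Python’s standard library. Counter is not part of this lesson; it is used here only to confirm the counts.

```python
from collections import Counter

def freq_table(column):
    frequency_table = {}
    for value in column:
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1
    return frequency_table

content_ratings = ['4+', '4+', '12+', '17+', '4+']
table = freq_table(content_ratings)
print(table)  # {'4+': 3, '12+': 1, '17+': 1}

# Cross-check: Counter counts the same values independently.
assert table == dict(Counter(content_ratings))
```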

Using Both Functions Together

Now we can combine both extract() and freq_table() to build a frequency table for any column.

Let’s say we want to see how many apps exist in each genre. The “prime_genre” column is at index 11:

genres = extract(11)
genres_ft = freq_table(genres)

Let’s print the result:

print(genres_ft)

You’ll get something like this:

{
 'Games': 3862,
 'Entertainment': 535,
 'Education': 453,
 'Photo & Video': 349,
 ...
}

Now we can clearly see which genres are most common — no manual counting required!
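
Once you have a frequency table, one way to pick out the most common value is max() with the dictionary’s get method as the key. The sketch below uses only the four counts printed above, so it is a partial table rather than the full result.

```python
# Partial frequency table: only the four genres printed above.
genres_ft = {
    'Games': 3862,
    'Entertainment': 535,
    'Education': 453,
    'Photo & Video': 349,
}

# max() with key=genres_ft.get returns the key that has the largest count.
most_common = max(genres_ft, key=genres_ft.get)
print(most_common)  # Games

# Sort the (genre, count) pairs from most to least frequent.
ranked = sorted(genres_ft.items(), key=lambda pair: pair[1], reverse=True)
for genre, count in ranked:
    print(genre, count)
```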


A More Efficient Approach: One Function to Do It All

In the previous section, we used two separate functions to build a frequency table:

  1. extract() – to get all values from a specific column
  2. freq_table() – to count the frequency of those values

That works well, but we can simplify things by combining both steps into a single function. This makes our code cleaner and easier to reuse.


Writing One Function Instead of Two

Let’s say we want to generate a frequency table for the column at index 10 (which stores content ratings like '4+', '12+', etc.). Instead of first extracting the values, we can just count them directly while looping through the dataset.

Here’s how we can do that:

def freq_table(index):
    frequency_table = {}

    for row in apps_data[1:]:  # skip header
        value = row[index]
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1

    return frequency_table

Let’s break this down:

  • This function accepts one input: the column index we want to analyze.
  • We loop through the dataset (ignoring the header row).
  • For each row, we grab the value in the column we’re interested in.
  • We update the dictionary with the count of that value.

Try It Out

Let’s use our new freq_table() function to analyze user ratings, which are in column index 7:

ratings_ft = freq_table(7)
print(ratings_ft)

This might give you something like:

{
 '4.5': 2663,
 '4.0': 1553,
 '3.5': 702,
 '3.0': 455,
 ...
}

Now you can instantly see how many apps have each rating. No need to extract the column manually anymore!


In the next section, we’ll take things further and make this function even more reusable by allowing it to work on different datasets — not just apps_data.



Making Our Function Truly Reusable with Multiple Parameters

So far, our freq_table() function works well, but it has a limitation — it only works with the apps_data dataset. What if you want to reuse it for another dataset in a different project?

To fix this, we’ll improve the function by adding a second parameter, so that it takes both the dataset and the column index.


Why Multiple Parameters?

Let’s say you have two datasets:

  • apps_data – iOS apps dataset
  • games_data – a different dataset about games

If your function is hardcoded to only work with apps_data, it becomes useless for anything else. But if you pass the dataset as a parameter, it becomes flexible and reusable.


Updated Function: More Powerful, Still Simple

Here’s how we define the improved version:

def freq_table(data_set, index):
    frequency_table = {}

    for row in data_set[1:]:  # skip header
        value = row[index]
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1

    return frequency_table

Now we’ve got two parameters:

  • data_set: the list of lists (your dataset)
  • index: the column we want to analyze

Let’s Use It

Let’s try generating a frequency table for the user_rating column (index 7) in the apps_data dataset:

ratings_ft = freq_table(apps_data, 7)
print(ratings_ft)

Why This Is Better

With this version of the function:

  • You can analyze any column of any dataset.
  • It’s easy to share and reuse in different projects.
  • It avoids hardcoding and makes your code future-proof.
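
Here is a minimal sketch of that flexibility: the same function counting values in two unrelated datasets. Both mini_apps and mini_games are invented for this example and stand in for apps_data and games_data.

```python
def freq_table(data_set, index):
    frequency_table = {}
    for row in data_set[1:]:  # skip header
        value = row[index]
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1
    return frequency_table

# Hypothetical miniature datasets, invented for this example.
mini_apps = [
    ['name', 'cont_rating'],
    ['App A', '4+'],
    ['App B', '4+'],
    ['App C', '12+'],
]
mini_games = [
    ['title', 'platform'],
    ['Game X', 'PC'],
    ['Game Y', 'Console'],
    ['Game Z', 'PC'],
]

print(freq_table(mini_apps, 1))   # {'4+': 2, '12+': 1}
print(freq_table(mini_games, 1))  # {'PC': 2, 'Console': 1}
```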

In the next section, we’ll see how Python lets us pass arguments to such functions in two different ways — keyword arguments and positional arguments — and how that affects your code.



Keyword vs Positional Arguments: What’s the Difference?

Now that our freq_table() function accepts two parameters, let’s talk about how we pass values to those parameters when calling the function.

Python lets us do it in two main ways:

  • Positional arguments
  • Keyword arguments

Let’s see how both work — and when to use which.


Positional Arguments

With positional arguments, the order matters.

Here’s our function again:

def freq_table(data_set, index):
    # ...

If you call it like this:

freq_table(apps_data, 7)

Python understands:

  • apps_data goes to data_set
  • 7 goes to index

This is short and clean — but risky. If you switch the order by mistake:

freq_table(7, apps_data)  # Wrong!

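Here Python raises a TypeError, because the integer 7 can’t be sliced like a list. With other functions, swapped arguments may not raise anything at all and instead cause a bug you don’t notice right away.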
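
To see what the swapped call does in practice, the sketch below runs it on a small made-up dataset (mini_apps is hypothetical) and catches the resulting exception so the script keeps running.

```python
def freq_table(data_set, index):
    frequency_table = {}
    for row in data_set[1:]:  # skip header
        value = row[index]
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1
    return frequency_table

# Hypothetical miniature dataset, invented for this example.
mini_apps = [
    ['name', 'cont_rating'],
    ['App A', '4+'],
    ['App B', '12+'],
]

caught = False
try:
    freq_table(1, mini_apps)  # arguments swapped by mistake
except TypeError as error:
    # data_set is now the integer 1, and an integer cannot be sliced.
    caught = True
    print('TypeError:', error)
```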


Keyword Arguments

Keyword arguments are more explicit. You pass the values with their names:

freq_table(data_set=apps_data, index=7)

You can also change the order when using keywords:

freq_table(index=7, data_set=apps_data)  # Still works!

This makes your code more readable and less error-prone, especially when you have many parameters.


Quick Example: Both Ways

# Positional (order matters)
ratings_ft = freq_table(apps_data, 7)

# Keyword (order doesn’t matter)
ratings_ft = freq_table(data_set=apps_data, index=7)
ratings_ft = freq_table(index=7, data_set=apps_data)

All three produce the same result.
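
The two styles can also be mixed in one call, as long as the positional arguments come first. The sketch below shows the allowed form and, using compile() purely so the error can be caught and printed, the rejected one. The mini_apps dataset is made up for the example.

```python
def freq_table(data_set, index):
    frequency_table = {}
    for row in data_set[1:]:  # skip header
        value = row[index]
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1
    return frequency_table

# Hypothetical miniature dataset, invented for this example.
mini_apps = [
    ['name', 'cont_rating'],
    ['App A', '4+'],
    ['App B', '4+'],
    ['App C', '12+'],
]

# Positional first, keyword second: allowed.
mixed = freq_table(mini_apps, index=1)
print(mixed)  # {'4+': 2, '12+': 1}

# Keyword first, positional second: Python rejects this before running it.
bad_call = "freq_table(data_set=mini_apps, 1)"
try:
    compile(bad_call, '<example>', 'eval')
    rejected = False
except SyntaxError as error:
    rejected = True
    print('SyntaxError:', error.msg)
```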


Practice: Generating More Frequency Tables

Let’s create three frequency tables for different columns using both styles:

# Content rating column (index 10) – use positional
content_ratings_ft = freq_table(apps_data, 10)

# User rating column (index 7) – use keyword (standard order)
ratings_ft = freq_table(data_set=apps_data, index=7)

# Genre column (index 11) – use keyword (reversed order)
genres_ft = freq_table(index=11, data_set=apps_data)

Summary

Style        Syntax Example                       Safe from ordering mistakes?
Positional   func(apps_data, 7)                   ❌ Not always
Keyword      func(data_set=apps_data, index=7)    ✅ Yes

Use positional when the meaning is obvious and the function is short.

Use keyword when the function has many parameters or to prevent mistakes.

Next up — let’s combine functions to create more powerful operations!


Combining Functions: Building Blocks for Smarter Code

One of the best parts of learning to code is this: you don’t need to solve everything from scratch.

Once you write a function, you can reuse it — not just by calling it again, but even inside other functions.

In this section, we’ll explore how to combine functions together to build more complex, smarter tools.


Goal: Calculate the Mean of a Column

We want to write a function called mean() that:

  1. Takes a dataset and a column index
  2. Extracts the values in that column
  3. Computes the mean (average)
  4. Returns the result

To do that, we’ll use functions we already built:

  • extract() – gets a column’s values (this version takes the dataset as its first parameter, just like freq_table())
  • find_sum() – adds up a list
  • find_length() – counts the items in a list

Let’s first revisit those helper functions.

def extract(data_set, index):
    column = []
    for row in data_set[1:]:
        column.append(row[index])
    return column

def find_sum(a_list):
    a_sum = 0
    for element in a_list:
        a_sum += float(element)  # Convert to float for numeric calculations
    return a_sum

def find_length(a_list):
    length = 0
    for _ in a_list:
        length += 1
    return length

Writing the mean() Function

Now let’s build the new function mean() using the three helpers above.

def mean(data_set, index):
    column = extract(data_set, index)
    total = find_sum(column)
    count = find_length(column)
    return total / count

This function does everything we need:

  • Extracts the column using extract()
  • Adds up the values using find_sum()
  • Counts how many values there are using find_length()
  • Divides total by count to get the average

Simple. Clean. Reusable.


Using It in Practice

Let’s say we want to find the average price of apps in the App Store dataset.

We can call:

avg_price = mean(apps_data, 4)  # Column 4 is the price column
print(avg_price)

This gives us the mean price for all apps.

You can also reuse the same mean() function to calculate the average rating:

avg_rating = mean(apps_data, 7)  # Column 7 is the user rating column
print(avg_rating)
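
To check mean() by hand, the sketch below runs it on three made-up apps whose ratings are easy to average. It also computes the same value with Python’s built-in sum() and len(), which do the same job as find_sum() and find_length(). The mini_apps dataset is hypothetical.

```python
def extract(data_set, index):
    column = []
    for row in data_set[1:]:  # skip header
        column.append(row[index])
    return column

def find_sum(a_list):
    a_sum = 0
    for element in a_list:
        a_sum += float(element)  # values are stored as strings
    return a_sum

def find_length(a_list):
    length = 0
    for _ in a_list:
        length += 1
    return length

def mean(data_set, index):
    column = extract(data_set, index)
    return find_sum(column) / find_length(column)

# Hypothetical miniature dataset, invented for this example.
mini_apps = [
    ['track_name', 'user_rating'],
    ['App A', '3.5'],
    ['App B', '4.5'],
    ['App C', '4.0'],
]

avg_rating = mean(mini_apps, 1)
print(avg_rating)  # 4.0

# The same calculation with built-ins, for comparison.
ratings = extract(mini_apps, 1)
builtin_avg = sum(float(r) for r in ratings) / len(ratings)
print(builtin_avg)  # 4.0
```

Keep in mind that mean() divides by the number of values, so a dataset with no rows after the header would raise a ZeroDivisionError.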

Why This Matters

By combining small functions into a bigger one, we:

  • Avoid repeating code
  • Keep each function simple and focused
  • Build reusable tools for future analysis

You’ll often hear this approach described as composing functions, or as functional abstraction: small, focused functions are combined into larger ones. It’s how professionals build complex software efficiently.


In the next section, we’ll talk about something every programmer faces: errors, also known as bugs — and how to debug them.


Debugging Functions: How to Track Down and Fix Errors

Now that you’re writing more complex functions, it’s only natural that you’ll run into problems. We call these problems bugs, and the process of finding and fixing them is called debugging.

In this section, we’ll practice debugging a real piece of code step by step, just like a professional would.


A Realistic Example with Bugs

Imagine you’re trying to compute the average price and average rating of apps using this code:

avg_price = mean(apps_data, 4)
avg_rating = mean(apps_data, 7)

But when you run it, Python throws an error.

Let’s look at the full code with bugs introduced:

def find_sum(a_list):
    a_sum = 0
    for element in a_list
        a_sum += element
    return a_sum

def mean(data_set, index):
    column = extract(data_set, index)
    return find_sm(column) / find_length(column)

This code won’t work. There are a few bugs here.


Step 1: Read the Error (Traceback)

When you run this code, you’ll likely get an error like:

File "your_script", line 16
    for element in a_list
                        ^
SyntaxError: expected ':'

This tells us:

  • There’s a syntax error.
  • It’s on the line: for element in a_list
  • We forgot a colon : at the end of the for loop

Step 2: Fix the Syntax Error

Let’s fix the missing colon:

for element in a_list:  # ← fixed!

Now if we run it again, we’ll likely get a different error:

NameError: name 'find_sm' is not defined

This tells us:

  • There’s a typo in the function name
  • find_sm() should be find_sum()

Step 3: Fix the Function Call

Update the mean() function:

return find_sum(column) / find_length(column)

Almost there. But you might still hit another error:

TypeError: unsupported operand type(s) for +=: 'int' and 'str'

This means:

  • You’re trying to add a string to an integer: a_sum starts as 0, but the values in the column are strings such as '0.0'
  • We need to convert each element to a float before adding it

Fix it like this inside find_sum():

for element in a_list:
    a_sum += float(element)
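
The sketch below reproduces this error on a short list of made-up price strings and then shows the fixed version. The name find_sum_buggy() is used only here, to keep the broken and corrected versions side by side.

```python
prices = ['1.5', '2.5']  # made-up values, stored as strings

def find_sum_buggy(a_list):
    a_sum = 0
    for element in a_list:
        a_sum += element  # int += str raises TypeError
    return a_sum

def find_sum(a_list):
    a_sum = 0
    for element in a_list:
        a_sum += float(element)  # convert each string first
    return a_sum

failed = False
try:
    find_sum_buggy(prices)
except TypeError as error:
    failed = True
    print('TypeError:', error)

total = find_sum(prices)
print(total)  # 4.0
```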

Final Working Version

Here’s the corrected code:

def extract(data_set, index):
    column = []
    for row in data_set[1:]:
        column.append(row[index])
    return column

def find_sum(a_list):
    a_sum = 0
    for element in a_list:
        a_sum += float(element)
    return a_sum

def find_length(a_list):
    length = 0
    for _ in a_list:
        length += 1
    return length

def mean(data_set, index):
    column = extract(data_set, index)
    return find_sum(column) / find_length(column)

avg_price = mean(apps_data, 4)
avg_rating = mean(apps_data, 7)

Why This Skill Is Essential

As your functions grow, bugs are inevitable. But instead of feeling frustrated, take these steps:

  • Read the traceback carefully
  • Look at the line it points to
  • Understand the error message (syntax, name, type, etc.)
  • Fix one bug at a time

This is exactly what real developers do. And the more you practice it, the better you’ll become.
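
When a traceback is long, the last line is usually the best place to start, because it names the error type and the message. As a small sketch of that habit, the code below triggers a NameError on purpose and prints only the final line of the traceback. The function mean_with_typo() is invented for this example.

```python
import traceback

def mean_with_typo(values):
    # 'find_sm' is deliberately misspelled to trigger a NameError.
    return find_sm(values) / len(values)

report = ''
try:
    mean_with_typo([1, 2, 3])
except NameError:
    # format_exc() returns the full traceback text as a string.
    report = traceback.format_exc()

# The last line holds the error type and message.
last_line = report.strip().splitlines()[-1]
print(last_line)
```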



Review and Takeaways

You’ve made it through one of the most essential lessons in Python: writing reusable, flexible, and debuggable functions.

Whether you’re analyzing data or building full applications, functions will be your best friend in keeping your code organized, efficient, and easy to reuse.

Let’s go over what we learned.


What You Learned

✅ Creating Functions That Extract and Analyze Data

You learned to write custom functions like:

def extract(data_set, index):
    ...

And how to combine them to build something powerful:

def mean(data_set, index):
    column = extract(data_set, index)
    return find_sum(column) / find_length(column)

✅ Reusability Through Parameters

We extended our frequency table function so it could take any dataset and any column index:

def freq_table(data_set, index):
    ...

This allows us to run:

freq_table(apps_data, 7)      # User ratings
freq_table(apps_data, 11)     # App genres

✅ Keyword vs. Positional Arguments

We saw how these two forms work:

freq_table(apps_data, 10)                        # Positional
freq_table(data_set=apps_data, index=10)         # Keyword
freq_table(index=10, data_set=apps_data)         # Keyword (order doesn't matter)

With positional arguments, though, order matters!


✅ Reusing Functions Inside Other Functions

We reused smaller tools like find_sum() and find_length() to build more complex logic. This is how real software is built: piece by piece.

def mean(data_set, index):
    column = extract(data_set, index)  # extract once, then reuse the list
    return find_sum(column) / find_length(column)

✅ Debugging Like a Pro

We walked through real error messages (tracebacks) and fixed:

  • Syntax mistakes (missing colons)
  • Typos (misspelled function names)
  • Type errors (adding a string to a number)

By the end of this, you’ve seen that debugging is a skill you can practice and master.


Summary Table

Concept                Example Code Snippet
Create function        def mean(data_set, index):
Multiple parameters    def freq_table(data_set, index):
Positional arguments   freq_table(apps_data, 7)
Keyword arguments      freq_table(index=7, data_set=apps_data)
Reuse functions        return find_sum(column) / find_length(column)
Debugging              Fix one error at a time using tracebacks

What’s Next?

Now that you’ve learned to build and reuse functions effectively, you’re ready to start applying them in real data projects. From now on, whenever you spot a repetitive task, think: “Can I write a function for this?”

And if something breaks? Don’t panic. Use the traceback. Follow the arrows. Fix the bugs.

You’ve got this. Let’s keep going!