Lesson 1 - Understanding Linear and Nonlinear Functions
Welcome to the Math of Machine Learning
This lesson opens the math module of the course. Before you can understand how a model learns, you need to understand the objects that models are built from: functions. Here you will learn what a function is, how linear functions behave with their straight lines and constant slope, and how nonlinear functions curve because their slope keeps changing. You will compute and plot both with NumPy and Matplotlib so the ideas feel concrete, not abstract.
By the end of this lesson, you will be able to:
- Explain what a mathematical function is and why functions are central to machine learning
- Write any linear function in the form $y = mx + b$ and identify its slope and y-intercept
- Compute the slope of a line as a rate of change between two points
- Distinguish a linear function from a nonlinear one, and explain why a nonlinear function’s slope changes
- Use NumPy and Matplotlib to evaluate and visualize both kinds of functions
You should be comfortable with basic Python and have NumPy and Matplotlib installed. No prior calculus is required. Let’s begin.
Why Functions Matter in Machine Learning
In the foundations module, you trained a k-nearest neighbors model. That algorithm is easy to picture because it stores the training data and compares new points to it directly. But that simplicity has a cost: the “model” is the entire dataset, so every prediction means scanning all the training examples again. That does not scale.
Most of the algorithms you will meet from here on work differently. Instead of memorizing data, they summarize it as a mathematical function. Training means finding the function that best captures how the inputs relate to the output. Once you have that function, predicting is cheap: you plug an input in and read the output out.
A function is a rule that takes an input and returns exactly one output. You write $y = f(x)$, which reads as “y is a function of x.” You feed in $x$, the function does something to it, and out comes $y$. That is the whole idea, and almost every model in this course is, at heart, a function from features to a prediction.
To make progress, you first need to understand the two broad families of functions: linear and nonlinear. We will study linear functions first because they are the simplest, then build up to curves.
Functions are the model
When you hear that a model has “learned,” it usually means an algorithm has chosen the numbers inside a function so that the function fits the data well. Understanding functions, slopes, and how outputs respond to inputs is therefore not optional background. It is the vocabulary of the entire field.
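To make "choosing the numbers inside a function" concrete, here is a minimal sketch. It assumes noiseless data generated from a straight line and uses NumPy's np.polyfit as a stand-in for a training algorithm; the lesson itself does not prescribe this method, and the names x_train, y_train, multiplier, offset, and predict are illustrative.

```python
import numpy as np

# Hypothetical training data generated from the rule y = 2x + 1 (no noise).
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = 2 * x_train + 1

# "Training": find the two numbers of the best-fitting straight line.
# np.polyfit with degree 1 performs a least-squares straight-line fit and
# returns the coefficients from the highest power down.
multiplier, offset = np.polyfit(x_train, y_train, 1)

def predict(x):
    """The learned model: a function from an input to a prediction."""
    return multiplier * x + offset

print("learned numbers:", round(multiplier, 6), round(offset, 6))
print("prediction at x = 10:", round(predict(10.0), 6))
```

Because the data lie exactly on a line, the fit should recover numbers very close to 2 and 1, and predicting for a new input is a single multiply and add.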
Linear Functions
A linear function draws a straight line. Every linear function can be written in the same compact form:
$$y = mx + b$$
Here $x$ and $y$ are the variables (the input and the output), while $m$ and $b$ are fixed numbers that define this particular line. Pick values for $m$ and $b$ and you have pinned down one specific line. For example, $y = 2x + 1$ is the linear function with $m = 2$ and $b = 1$.
Let’s focus on $y = 2x + 1$. This function takes any $x$, multiplies it by 2, then adds 1. You can build a small table of inputs and outputs by hand:
| x | y = 2x + 1 |
|---|---|
| -2 | -3 |
| -1 | -1 |
| 0 | 1 |
| 1 | 3 |
| 2 | 5 |
| 3 | 7 |
Notice the pattern in the right column. Every time $x$ increases by 1, $y$ increases by exactly 2. That constant step is the defining property of a linear function, and it has a name: the slope.
Computing a Linear Function with NumPy
You do not want to fill in tables by hand for hundreds of points. NumPy lets you evaluate a function on a whole array of inputs at once. The np.linspace function creates evenly spaced values over a range, and arithmetic on a NumPy array applies to every element.
import numpy as np
# 6 evenly spaced x-values from -2 to 3
x = np.linspace(-2, 3, 6)
y = 2 * x + 1
print("x:", x)
print("y:", y)
# Output:
# x: [-2. -1.  0.  1.  2.  3.]
# y: [-3. -1.  1.  3.  5.  7.]
These are exactly the values from the table above, computed in two lines. The expression 2 * x + 1 is the function written in code, with $m = 2$ and $b = 1$.
Slope and y-intercept
The two numbers $m$ and $b$ each control one geometric feature of the line.
The y-intercept is $b$: the value of $y$ where the line crosses the y-axis. The crossing happens when $x = 0$, and plugging $x = 0$ into $y = 2x + 1$ gives:
$$y = 2(0) + 1 = 1$$
So the line crosses the y-axis at height 1. Change $b$ and you slide the whole line up or down without tilting it.
The slope is $m$: it controls how steep the line is. A positive slope tilts the line upward as you move right, a negative slope tilts it downward, and a slope of 0 makes the line perfectly flat. The larger the magnitude of $m$, the steeper the climb or descent.
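A short sketch can make the two roles visible. It assumes a helper named line (not part of the lesson) and evaluates several lines at the same inputs: changing the y-intercept shifts every output by the same amount, while changing the slope changes how fast the outputs grow or fall.

```python
import numpy as np

def line(x, slope, intercept):
    """Evaluate a straight line with the given slope and y-intercept."""
    return slope * x + intercept

x = np.array([0.0, 1.0, 2.0, 3.0])

base = line(x, slope=2, intercept=1)      # the lesson's line y = 2x + 1
shifted = line(x, slope=2, intercept=4)   # same slope, higher intercept
steeper = line(x, slope=5, intercept=1)   # same intercept, larger slope
falling = line(x, slope=-2, intercept=1)  # negative slope tilts downward
flat = line(x, slope=0, intercept=1)      # zero slope is a horizontal line

print("shifted - base:", shifted - base)  # constant gap: the line slid up
print("values at x = 0:", base[0], steeper[0], falling[0], flat[0])
print("steps (base):   ", np.diff(base))
print("steps (steeper):", np.diff(steeper))
print("steps (falling):", np.diff(falling))
print("steps (flat):   ", np.diff(flat))
```

All four lines with intercept 1 agree at the first input, where x is 0, and only their step sizes differ.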
Slope as a Rate of Change
The most useful way to think about slope is as a rate of change: how much $y$ changes for a given change in $x$. Using the Greek letter delta ($\Delta$) to mean “change in,” the slope is:
$$m = \frac{\Delta y}{\Delta x}$$
Take any two points on the line $y = 2x + 1$, say $(1, 3)$ and $(3, 7)$, and compute the rise over the run:
$$m = \frac{\Delta y}{\Delta x} = \frac{7 - 3}{3 - 1} = \frac{4}{2} = 2$$
The slope is 2, which matches the coefficient of $x$ in the equation. The remarkable thing about a linear function is that you get the same answer no matter which two points you choose. Try $(-2, -3)$ and $(0, 1)$:
$$m = \frac{1 - (-3)}{0 - (-2)} = \frac{4}{2} = 2$$
Same slope. A linear function has one constant slope everywhere. That is what makes it a straight line.
When you talk about two specific points, the usual notation is $(x_1, y_1)$ and $(x_2, y_2)$, giving the general slope formula:
$$m = \frac{y_2 - y_1}{x_2 - x_1}$$
A slope Function in Python
You can capture this formula in a small reusable function. It takes the coordinates of two points and returns the rate of change between them.
def slope(x1, y1, x2, y2):
    # Rise over run; assumes x1 != x2 (a vertical line has no defined slope)
    return (y2 - y1) / (x2 - x1)
# Two points on y = 2x + 1
print(slope(1, 3, 3, 7))    # between (1, 3) and (3, 7)
print(slope(-2, -3, 0, 1))  # between (-2, -3) and (0, 1)
# Output:
# 2.0
# 2.0
Both calls return 2.0, confirming numerically what the algebra told us: on a line, the slope is the same between any pair of points.
Rise over run
A quick way to remember the slope formula is “rise over run.” The rise is how far you move vertically ($\Delta y$), and the run is how far you move horizontally ($\Delta x$). Slope is rise divided by run. If the line goes up as you read left to right, the rise is positive and so is the slope.
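The claim that a line gives the same rise over run for any pair of points can be checked more broadly than with two hand-picked pairs. The sketch below assumes a handful of unevenly spaced x-values (chosen arbitrarily for illustration) and computes the slope for every pair of points on y = 2x + 1.

```python
from itertools import combinations

# Arbitrary, unevenly spaced x-values (illustrative choice).
xs = [-2.0, -0.5, 1.0, 4.0, 10.0]
points = [(x, 2 * x + 1) for x in xs]  # points on y = 2x + 1

# Rise over run for every distinct pair of points.
slopes = [
    (y2 - y1) / (x2 - x1)
    for (x1, y1), (x2, y2) in combinations(points, 2)
]

print(len(slopes), "pairs checked")
print("distinct slopes:", set(slopes))
```

Five points give ten distinct pairs, and every pair should produce the same slope of 2, however far apart the points are.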
Nonlinear Functions
Not every relationship is a straight line. Consider this function:
$$y = x^2 - 2$$
This does not fit the $y = mx + b$ form, because $x$ is raised to the power 2, not 1. It is a nonlinear function. Nonlinear functions draw curves rather than straight lines, and their outputs do not change in step with their inputs: a steady increase in $x$ does not produce a steady increase in $y$.
Whenever $x$ appears with a power other than 1 (or inside a root, in the denominator of a fraction, and so on), the function is nonlinear. A few more examples are $y = x^3$, $y = \sqrt{x}$, and $y = \frac{1}{x}$.
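As a quick numerical check of those categories, the sketch below evaluates one function of each kind (a cube, a square root, and a reciprocal, chosen here purely as illustrations) at equally spaced inputs. For a linear function the consecutive differences would all be equal; for these they are not.

```python
import numpy as np

# Equally spaced inputs; kept positive so sqrt(x) and 1/x are defined.
x = np.array([1.0, 2.0, 3.0, 4.0])

examples = {
    "x**3": x**3,           # a power other than 1
    "sqrt(x)": np.sqrt(x),  # x inside a root
    "1/x": 1 / x,           # x in the denominator of a fraction
}

for name, y in examples.items():
    steps = np.diff(y)  # change in y between neighbouring inputs
    is_constant = bool(np.allclose(steps, steps[0]))
    print(f"{name:8s} steps: {np.round(steps, 3)}  constant: {is_constant}")
```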
Why the Slope Changes
Build a table for $y = x^2 - 2$ and watch what happens to the steps in $y$:
| x | y = x² − 2 | change in y |
|---|---|---|
| 0 | -2 | — |
| 1 | -1 | +1 |
| 2 | 2 | +3 |
| 3 | 7 | +5 |
For a linear function, that “change in y” column would be constant. Here it grows: +1, then +3, then +5. Because the step keeps changing, there is no single slope for the whole curve. The slope between $(0, -2)$ and $(1, -1)$ is
$$\frac{-1 - (-2)}{1 - 0} = 1$$
but between $(2, 2)$ and $(3, 7)$ it is
$$\frac{7 - 2}{3 - 2} = 5$$
Two different pairs of points give two different slopes. A nonlinear function has a changing slope, and that single fact is what separates it from a line.
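The same comparison can be run over more pairs. This sketch reuses the two-point slope formula (redefined here so the block is self-contained) on the curve y = x² − 2 and, for contrast, on the line y = 2x + 1. The helper names curve and line are illustrative.

```python
def slope(x1, y1, x2, y2):
    """Rise over run between two points (assumes x1 != x2)."""
    return (y2 - y1) / (x2 - x1)

def curve(x):
    return x**2 - 2

def line(x):
    return 2 * x + 1

# Slope between neighbouring integer x-values, from x = -3 up to x = 3.
for x1 in range(-3, 3):
    x2 = x1 + 1
    curve_slope = slope(x1, curve(x1), x2, curve(x2))
    line_slope = slope(x1, line(x1), x2, line(x2))
    print(f"x: {x1:2d} -> {x2:2d}   curve slope: {curve_slope:5.1f}   line slope: {line_slope:5.1f}")
```

The curve's slope runs from -5 up to 5, negative where the parabola falls and positive where it rises, while the line's slope stays at 2 for every pair.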
Visualizing Both Functions
A picture makes the contrast obvious. The chart below plots the linear function $y = 2x + 1$ as a straight line and the nonlinear function $y = x^2 - 2$ as a curve, on the same axes.
You can reproduce a plot like this yourself with NumPy and Matplotlib. The key is to evaluate each function on a dense array of $x$-values so the curve looks smooth.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-3, 3, 100)
y_linear = 2 * x + 1 # linear: constant slope
y_nonlinear = x**2 - 2 # nonlinear: changing slope
plt.plot(x, y_linear, label="y = 2x + 1")
plt.plot(x, y_nonlinear, label="y = x^2 - 2")
plt.axhline(0, color="gray", linewidth=0.8)
plt.axvline(0, color="gray", linewidth=0.8)
plt.legend()
plt.show()
The straight line never bends; the parabola sweeps down to a low point and back up. If you tried to summarize the parabola with a single slope, you would be wrong almost everywhere, because the steepness is different at every point.
Two points are not enough for a curve
On a straight line, sampling two points tells you the slope everywhere. On a curve, it does not: the slope between two points depends entirely on which two you pick. This is exactly why curves are harder to reason about, and why later lessons introduce tools to measure the slope at a single point.
Why Machine Learning Needs Nonlinearity
If linear functions are so much simpler, why bother with curves at all? Because most real relationships are not straight lines.
A linear model assumes that every unit increase in an input changes the output by the same fixed amount. Sometimes that is a fine approximation. But think about how the value of a house responds to its size, how a drug’s effect responds to its dose, or how the trajectory of a thrown ball rises and then falls. In each case the rate of change is not constant: the effect speeds up, slows down, or reverses. A straight line simply cannot bend to match those patterns.
Nonlinear functions can. By allowing the slope to change, they bend and curve to fit complicated data. This flexibility is the source of the power behind polynomial models, decision boundaries that wrap around clusters, and the deep neural networks that recognize images and language. Strip the nonlinearity out of a neural network and, no matter how many layers you stack, it collapses back into a single straight-line relationship.
So the linear-versus-nonlinear distinction is not academic. It marks the line between models that can only fit straight relationships and models that can capture the curved reality of most data.
import numpy as np
# Linear: equal steps in x -> equal steps in y
x = np.array([0, 1, 2, 3])
linear = 2 * x + 1
print("linear steps:", np.diff(linear))
# Output: linear steps: [2 2 2]
# Nonlinear: equal steps in x -> growing steps in y
nonlinear = x**2 - 2
print("nonlinear steps:", np.diff(nonlinear))
# Output: nonlinear steps: [1 3 5]
np.diff reports the difference between consecutive values. For the line the differences are all 2, the constant slope. For the curve they grow as 1, 3, 5, a direct numerical fingerprint of a changing slope.
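The earlier claim that stacked layers without nonlinearity collapse into one straight-line relationship can be illustrated in one dimension. The sketch below is a deliberate simplification: each "layer" is a single linear function with made-up numbers, whereas real networks use matrices, and the squaring step is only a stand-in for a nonlinearity.

```python
import numpy as np

def layer1(x):
    return 3 * x - 1    # illustrative linear "layer"

def layer2(x):
    return -2 * x + 4   # another illustrative linear "layer"

x = np.arange(0.0, 5.0)  # 0, 1, 2, 3, 4

# Two linear layers stacked: -2 * (3x - 1) + 4 = -6x + 6, still a line.
stacked = layer2(layer1(x))
print("stacked linear steps:   ", np.diff(stacked))

# Put a nonlinearity (squaring, chosen for illustration) between the layers.
with_nonlinearity = layer2(layer1(x) ** 2)
print("with nonlinearity steps:", np.diff(with_nonlinearity))
```

The stacked linear layers produce constant steps of -6, exactly what the single line y = -6x + 6 would give, while the version with the squaring step produces steps that keep changing.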
Practice Exercises
Try these before checking the hints.
Exercise 1: Identify Slope and y-intercept
Given the linear function $y = -4x + 7$, state its slope and its y-intercept without running any code. Then confirm the y-intercept by evaluating the function at $x = 0$ in Python.
def f(x):
    return -4 * x + 7
# Your code here: evaluate f at x = 0
Hint
Compare $y = -4x + 7$ to the general form $y = mx + b$. The slope is the number multiplying $x$, and the y-intercept is the standalone constant. Running print(f(0)) should print 7, the y-intercept, because the $-4x$ term vanishes when $x = 0$.
Exercise 2: Confirm a Constant Slope
Reuse the slope(x1, y1, x2, y2) function from this lesson. Pick three different pairs of points on the line $y = -4x + 7$ and confirm that the slope is the same every time.
def slope(x1, y1, x2, y2):
    return (y2 - y1) / (x2 - x1)
# Your code here: compute the slope for three pairs of points on y = -4x + 7
Hint
First generate points by plugging x-values into the function: for $x = 0$ you get $y = 7$, for $x = 1$ you get $y = 3$, for $x = 2$ you get $y = -1$. Feed any two of these into slope. Every pair should return -4.0, because a linear function has one constant slope.
Exercise 3: Show That a Curve’s Slope Changes
For the nonlinear function $y = x^2 - 2$, use the slope function to compute the slope between $x = 0$ and $x = 1$, and then between $x = 2$ and $x = 3$. Compare the two results.
# Your code here: reuse slope() and the function y = x**2 - 2
Hint
Compute each y-value first: at $x = 0$, $y = -2$; at $x = 1$, $y = -1$; at $x = 2$, $y = 2$; at $x = 3$, $y = 7$. The first slope is $\frac{-1 - (-2)}{1 - 0} = 1$ and the second is $\frac{7 - 2}{3 - 2} = 5$. Different answers prove the slope is not constant on a curve.
Summary
You have met the most fundamental objects in the math of machine learning: functions, and the split between linear and nonlinear ones. Let’s review.
Key Concepts
Functions
- A function is a rule that maps each input $x$ to exactly one output $y$, written $y = f(x)$
- Most machine learning models are functions; training means choosing the numbers inside the function so it fits the data
Linear Functions
- Every linear function has the form $y = mx + b$ and draws a straight line
- $m$ is the slope (steepness) and $b$ is the y-intercept (where the line crosses the y-axis at $x = 0$)
- Slope is a rate of change, $m = \frac{\Delta y}{\Delta x}$, the rise over the run
- A linear function has one constant slope: the rate of change is identical between any two points
Nonlinear Functions
- A function is nonlinear when $x$ appears with a power other than 1 (or in a root, fraction, and so on), and it draws a curve
- A nonlinear function has a changing slope: the rate of change differs from point to point, so two points are not enough to describe it
- An example is $y = x^2 - 2$, the curve studied in this lesson
Computing with NumPy
- np.linspace(start, stop, num) creates evenly spaced inputs
- Arithmetic like 2 * x + 1 or x**2 - 2 applies to every element of an array at once
- np.diff reveals constant steps for a line and growing steps for a curve
Why This Matters
The distinction you learned here runs through the entire course. Linear functions are the backbone of linear regression and the building blocks of larger models, and their constant slope is what makes them easy to reason about. But the world rarely moves in straight lines, and nonlinearity is what lets models bend to fit real, curved data, from house prices to neural networks.
There is also a deeper thread waiting. On a straight line you can read off the slope from any two points, but on a curve the slope changes everywhere, and a single pair of points cannot pin it down. Measuring how steep a curve is at one exact point is the central question of calculus, and answering it is what the next lessons build toward.
Next Steps
You now understand functions, slope, and the difference between straight lines and curves. The natural next question is how to talk precisely about what a function is doing as its input gets arbitrarily close to a particular value, which is the idea of a limit.
Continue to Lesson 2 - Understanding Limits
Learn how limits describe a function's behavior as the input approaches a value, even where the function is undefined.
Keep Building Your Skills
You have taken the first step into the mathematics that powers machine learning. Functions are the language every model speaks, and the linear-versus-nonlinear split you learned here will resurface in nearly every lesson that follows. Keep the two mental pictures close: a straight line with one steady slope, and a curve whose steepness shifts at every point. As the math grows, those two images will keep the new ideas grounded.