Lesson 1 - Understanding Linear and Nonlinear Functions

Welcome to the Math of Machine Learning

This lesson opens the math module of the course. Before you can understand how a model learns, you need to understand the objects that models are built from: functions. Here you will learn what a function is, how linear functions behave with their straight lines and constant slope, and how nonlinear functions curve because their slope keeps changing. You will compute and plot both with NumPy and Matplotlib so the ideas feel concrete, not abstract.

By the end of this lesson, you will be able to:

  • Explain what a mathematical function is and why functions are central to machine learning
  • Write any linear function in the form $y = mx + b$ and identify its slope and y-intercept
  • Compute the slope of a line as a rate of change between two points
  • Distinguish a linear function from a nonlinear one, and explain why a nonlinear function’s slope changes
  • Use NumPy and Matplotlib to evaluate and visualize both kinds of functions

You should be comfortable with basic Python and have NumPy and Matplotlib installed. No prior calculus is required. Let’s begin.


Why Functions Matter in Machine Learning

In the foundations module, you trained a k-nearest neighbors model. That algorithm is easy to picture because it stores the training data and compares new points to it directly. But that simplicity has a cost: the “model” is the entire dataset, so every prediction means scanning all the training examples again. That does not scale.

Most of the algorithms you will meet from here on work differently. Instead of memorizing data, they summarize it as a mathematical function. Training means finding the function that best captures how the inputs relate to the output. Once you have that function, predicting is cheap: you plug an input in and read the output out.

A function is a rule that takes an input and returns exactly one output. You write $y = f(x)$, which reads as “y is a function of x.” You feed in $x$, the function does something to it, and out comes $y$. That is the whole idea, and almost every model in this course is, at heart, a function from features to a prediction.
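In Python terms, the same idea might look like the minimal sketch below. The rule `f` (add 3 to the input) is a made-up example chosen only for illustration, not one used elsewhere in the lesson.

```python
def f(x):
    """A rule that maps each input x to exactly one output."""
    return x + 3

print(f(2))    # 5
print(f(2))    # 5 again: the same input always gives the same output
print(f(-1))   # 2
```

Calling `f` twice with the same input returns the same value both times, which is the "exactly one output" part of the definition.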

To make progress, you first need to understand the two broad families of functions: linear and nonlinear. We will study linear functions first because they are the simplest, then build up to curves.

Functions are the model

When you hear that a model has “learned,” it usually means an algorithm has chosen the numbers inside a function so that the function fits the data well. Understanding functions, slopes, and how outputs respond to inputs is therefore not optional background. It is the vocabulary of the entire field.
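As a rough preview of what "choosing the numbers inside a function" can look like, the sketch below asks NumPy to find the straight line that best fits a handful of points. It assumes a least-squares fit via `np.polyfit`, which is only one possible fitting method and is not necessarily the one this course uses later. The two numbers it returns, a slope `m` and an intercept `b`, are explained in the next section.

```python
import numpy as np

# Data generated from the line y = 2x + 1 (no noise, for illustration only)
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 2 * x + 1

# Fit a degree-1 polynomial; polyfit returns coefficients highest power first
m, b = np.polyfit(x, y, 1)

# The recovered numbers should be approximately 2 and 1
print(round(float(m), 6), round(float(b), 6))
```

The algorithm was never told the numbers 2 and 1. It recovered them from the data, which is the sense in which a model "learns" a function.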


Linear Functions

A linear function draws a straight line. Every linear function can be written in the same compact form:

$$y = mx + b$$

Here $x$ and $y$ are the variables (the input and the output), while $m$ and $b$ are fixed numbers that define this particular line. Pick values for $m$ and $b$ and you have pinned down one specific line. Both $y = 2x + 1$ and $y = 5$ are linear functions; the second is just the special case where $m = 0$.
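To make the roles of the two numbers concrete, here is a small sketch in plain Python. The helper name `linear` and its keyword arguments are hypothetical, introduced only for this illustration.

```python
def linear(x, m, b):
    """Evaluate y = m*x + b for a single input x."""
    return m * x + b

xs = [-1, 0, 1, 2]
line_a = [linear(x, m=2, b=1) for x in xs]   # y = 2x + 1
line_b = [linear(x, m=0, b=5) for x in xs]   # y = 5, the m = 0 special case

print(line_a)  # [-1, 1, 3, 5]
print(line_b)  # [5, 5, 5, 5]
```

The same helper produces both lines; only the values of `m` and `b` differ. With `m=0` the input no longer matters and every output equals `b`.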

Let’s focus on $y = 2x + 1$. This function takes any $x$, multiplies it by 2, then adds 1. You can build a small table of inputs and outputs by hand:

 x    y = 2x + 1
-2    -3
-1    -1
 0     1
 1     3
 2     5
 3     7

Notice the pattern in the right column. Every time $x$ increases by 1, $y$ increases by exactly 2. That constant step is the defining property of a linear function, and it has a name: the slope.

Computing a Linear Function with NumPy

You do not want to fill in tables by hand for hundreds of points. NumPy lets you evaluate a function on a whole array of inputs at once. The np.linspace function creates evenly spaced values over a range, and arithmetic on a NumPy array applies to every element.

import numpy as np

# 6 evenly spaced x-values from -2 to 3
x = np.linspace(-2, 3, 6)
y = 2 * x + 1

print("x:", x)
print("y:", y)
# Output:
# x: [-2. -1.  0.  1.  2.  3.]
# y: [-3. -1.  1.  3.  5.  7.]

These are exactly the values from the table above, computed in two lines. The expression 2 * x + 1 is the function $y = 2x + 1$ written in code, with $m = 2$ and $b = 1$.


Slope and y-intercept

The two numbers $m$ and $b$ each control one geometric feature of the line.

The y-intercept is $b$: the value of $y$ where the line crosses the y-axis. The crossing happens when $x = 0$, and plugging in $x = 0$ gives:

$$f(0) = 2(0) + 1 = 1$$

So the line $y = 2x + 1$ crosses the y-axis at height 1. Change $b$ and you slide the whole line up or down without tilting it.

The slope is $m$: it controls how steep the line is. A positive slope tilts the line upward as you move right, a negative slope tilts it downward, and a slope of 0 makes the line perfectly flat. The larger the magnitude of $m$, the steeper the climb or descent.
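One way to check these two claims numerically is sketched below. The variable names and the comparison lines (`2x + 4`, `5x + 1`, `-2x + 1`) are illustrative choices, not part of the lesson's running example. `np.diff` returns the differences between consecutive array values.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])

base = 2 * x + 1       # m = 2, b = 1
shifted = 2 * x + 4    # same slope, larger intercept
steeper = 5 * x + 1    # same intercept, larger slope
falling = -2 * x + 1   # negative slope

# Changing b moves every point up by the same amount (no tilt)
print((shifted - base).tolist())     # [3.0, 3.0, 3.0, 3.0]

# Changing m changes the step in y per unit step in x
print(np.diff(steeper).tolist())     # [5.0, 5.0, 5.0]
print(np.diff(falling).tolist())     # [-2.0, -2.0, -2.0]
```

The constant gap of 3 between `shifted` and `base` is the "slide without tilting" effect of `b`, while the per-step changes of 5 and -2 reflect the slope.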

Slope as a Rate of Change

The most useful way to think about slope is as a rate of change: how much $y$ changes for a given change in $x$. Using the Greek letter delta ($\Delta$) to mean “change in,” the slope is:

$$m = \frac{\text{change in } y}{\text{change in } x} = \frac{\Delta y}{\Delta x}$$

Take any two points on the line $y = 2x + 1$, say $(1, 3)$ and $(3, 7)$, and compute the rise over the run:

$$m = \frac{\Delta y}{\Delta x} = \frac{7 - 3}{3 - 1} = \frac{4}{2} = 2$$

The slope is 2, which matches the $m$ in the equation. The remarkable thing about a linear function is that you get the same answer no matter which two points you choose. Try $(-2, -3)$ and $(0, 1)$:

$$m = \frac{1 - (-3)}{0 - (-2)} = \frac{4}{2} = 2$$

Same slope. A linear function has one constant slope everywhere. That is what makes it a straight line.

When you talk about two specific points, the usual notation is $(x_1, y_1)$ and $(x_2, y_2)$, giving the general slope formula:

$$m = \frac{y_2 - y_1}{x_2 - x_1}$$
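To see why the answer never depends on which points you choose, substitute $y_1 = m x_1 + b$ and $y_2 = m x_2 + b$ into the slope formula. This short derivation assumes only that the two points have different x-values ($x_1 \neq x_2$):

```latex
\begin{aligned}
\frac{y_2 - y_1}{x_2 - x_1}
  &= \frac{(m x_2 + b) - (m x_1 + b)}{x_2 - x_1} \\
  &= \frac{m (x_2 - x_1)}{x_2 - x_1} \\
  &= m
\end{aligned}
```

The intercept $b$ cancels and the factor $(x_2 - x_1)$ divides out, leaving $m$ regardless of the points picked.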

A slope Function in Python

You can capture this formula in a small reusable function. It takes the coordinates of two points and returns the rate of change between them.

def slope(x1, y1, x2, y2):
    return (y2 - y1) / (x2 - x1)

# Two points on y = 2x + 1
print(slope(1, 3, 3, 7))   # between (1, 3) and (3, 7)
print(slope(-2, -3, 0, 1)) # between (-2, -3) and (0, 1)
# Output:
# 2.0
# 2.0

Both calls return 2.0, confirming numerically what the algebra told us: on a line, the slope is the same between any pair of points.
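The same check can be run over many points at once. The sketch below is one possible vectorized version, assuming NumPy's `np.diff` (differences between consecutive array values); the x-values are deliberately unevenly spaced to show that the spacing does not matter.

```python
import numpy as np

x = np.array([-2.0, -0.5, 1.0, 4.0])   # unevenly spaced on purpose
y = 2 * x + 1

# Rise over run between each pair of neighboring points
slopes = np.diff(y) / np.diff(x)
print(slopes.tolist())   # [2.0, 2.0, 2.0]
```

Every neighboring pair gives a slope of 2, even though the gaps between the x-values differ.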

Rise over run

A quick way to remember the slope formula is “rise over run.” The rise is how far you move vertically ($\Delta y$), and the run is how far you move horizontally ($\Delta x$). Slope is rise divided by run. If the line goes up as you read left to right, the rise is positive and so is the slope.


Nonlinear Functions

Not every relationship is a straight line. Consider this function:

$$y = x^2 - 2$$

This does not fit the $y = mx + b$ form, because $x$ is raised to the power 2, not 1. It is a nonlinear function. Nonlinear functions draw curves rather than straight lines, and their outputs do not change at a constant rate: a steady increase in $x$ does not produce a steady increase in $y$.

Whenever $x$ appears with a power other than 1 (or inside a root, in a denominator, and so on), the function is nonlinear. Here are a few more examples:

$$y = x^3$$
$$y = x^3 + 3x^2 + 2x - 1$$
$$y = \sqrt{x}$$
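These examples can be evaluated with NumPy in the same way as a line. The sketch below uses a few hand-picked, non-negative inputs (so the square root is defined); the variable names are illustrative.

```python
import numpy as np

x = np.array([0.0, 1.0, 4.0, 9.0])

cubic = x**3
polynomial = x**3 + 3 * x**2 + 2 * x - 1
root = np.sqrt(x)

print(cubic.tolist())       # [0.0, 1.0, 64.0, 729.0]
print(polynomial.tolist())  # [-1.0, 5.0, 119.0, 989.0]
print(root.tolist())        # [0.0, 1.0, 2.0, 3.0]
```

None of the three output lists is proportional to the gaps between the inputs, which is what you would expect from functions that do not have a constant slope.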

Why the Slope Changes

Build a table for $y = x^2 - 2$ and watch what happens to the steps in $y$:

x    y = x² − 2    change in y
0    -2
1    -1            +1
2     2            +3
3     7            +5

For a linear function, that “change in y” column would be constant. Here it grows: +1, then +3, then +5. Because the step keeps changing, there is no single slope for the whole curve. The slope between $x = 1$ and $x = 2$ is

$$m = \frac{2 - (-1)}{2 - 1} = 3,$$

but between $x = 2$ and $x = 3$ it is

$$m = \frac{7 - 2}{3 - 2} = 5.$$

Two different pairs of points give two different slopes. A nonlinear function has a changing slope, and that single fact is what separates it from a line.
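The growing steps follow a pattern that a little algebra makes explicit. As a worked sketch, take a step of exactly 1 from any starting value $x$ on $y = x^2 - 2$ and compute the change in $y$:

```latex
\begin{aligned}
\Delta y &= \bigl((x + 1)^2 - 2\bigr) - \bigl(x^2 - 2\bigr) \\
         &= x^2 + 2x + 1 - 2 - x^2 + 2 \\
         &= 2x + 1
\end{aligned}
```

Plugging in $x = 0, 1, 2$ gives steps of 1, 3, and 5, matching the table. The size of the step depends on where you start, which is another way of saying the slope is not constant.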

Visualizing Both Functions

A picture makes the contrast obvious. The chart below plots the linear function $y = 2x + 1$ as a straight line and the nonlinear function $y = x^2 - 2$ as a curve, on the same axes.

[Figure: a straight line for y = 2x + 1 next to an upward-curving parabola for y = x² − 2. The linear function keeps a constant slope, while the nonlinear function curves because its slope changes from point to point.]

You can reproduce a plot like this yourself with NumPy and Matplotlib. The key is to evaluate each function on a dense array of $x$-values so the curve looks smooth.

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-3, 3, 100)

y_linear = 2 * x + 1      # linear: constant slope
y_nonlinear = x**2 - 2    # nonlinear: changing slope

plt.plot(x, y_linear, label="y = 2x + 1")
plt.plot(x, y_nonlinear, label="y = x^2 - 2")
plt.axhline(0, color="gray", linewidth=0.8)
plt.axvline(0, color="gray", linewidth=0.8)
plt.legend()
plt.show()

The straight line never bends; the parabola sweeps down to a low point and back up. If you tried to summarize the parabola with a single slope, you would be wrong almost everywhere, because the steepness is different at every point.

Two points are not enough for a curve

On a straight line, sampling two points tells you the slope everywhere. On a curve, it does not: the slope between two points depends entirely on which two you pick. This is exactly why curves are harder to reason about, and why later lessons introduce tools to measure the slope at a single point.
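A quick numerical illustration of this point, reusing the lesson's `slope` function on $y = x^2 - 2$: the sketch below keeps one point fixed at $x = -1$ and varies the second point. The helper name `curve` and the chosen x-values are assumptions made for this example.

```python
def slope(x1, y1, x2, y2):
    return (y2 - y1) / (x2 - x1)

def curve(x):
    return x**2 - 2

start = -1
ends = [0, 1, 3]

# Slope from the fixed point at x = -1 to each of the other points
slopes = [slope(start, curve(start), end, curve(end)) for end in ends]
print(slopes)   # [-1.0, 0.0, 2.0]
```

Starting from the same point, the slope comes out negative, zero, or positive depending on where the second point lies, so no single pair of points describes the whole curve.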


Why Machine Learning Needs Nonlinearity

If linear functions are so much simpler, why bother with curves at all? Because most real relationships are not straight lines.

A linear model assumes that every unit increase in an input changes the output by the same fixed amount. Sometimes that is a fine approximation. But think about how the value of a house responds to its size, how a drug’s effect responds to its dose, or how the trajectory of a thrown ball rises and then falls. In each case the rate of change is not constant: the effect speeds up, slows down, or reverses. A straight line simply cannot bend to match those patterns.

Nonlinear functions can. By allowing the slope to change, they bend and curve to fit complicated data. This flexibility is the source of the power behind polynomial models, decision boundaries that wrap around clusters, and the deep neural networks that recognize images and language. Strip the nonlinearity out of a neural network and, no matter how many layers you stack, it collapses back into a single straight-line relationship.

So the linear-versus-nonlinear distinction is not academic. It marks the line between models that can only fit straight relationships and models that can capture the curved reality of most data.

import numpy as np

# Linear: equal steps in x -> equal steps in y
x = np.array([0, 1, 2, 3])
linear = 2 * x + 1
print("linear steps:", np.diff(linear))
# Output: linear steps: [2 2 2]

# Nonlinear: equal steps in x -> growing steps in y
nonlinear = x**2 - 2
print("nonlinear steps:", np.diff(nonlinear))
# Output: nonlinear steps: [1 3 5]

np.diff reports the difference between consecutive values. Because x increases by 1 at each step here, each difference is also the slope over that step. For the line the differences are all 2, the constant slope. For the curve they grow as 1, 3, 5, a direct numerical fingerprint of a changing slope.
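The earlier claim that stacked layers without nonlinearity collapse into a single straight-line relationship can be checked on a toy scale. The sketch below is not a neural network. It chains two made-up linear functions (the names `layer_one` and `layer_two` are hypothetical) and then repeats the experiment with a squaring step in between.

```python
import numpy as np

def layer_one(x):
    return 2 * x + 1      # hypothetical linear "layer"

def layer_two(x):
    return -3 * x + 4     # second hypothetical linear "layer"

x = np.array([0, 1, 2, 3])

# Linear after linear: algebraically -3(2x + 1) + 4 = -6x + 1, still a line
stacked = layer_two(layer_one(x))
print(np.diff(stacked).tolist())     # [-6, -6, -6]

# Insert a nonlinear step (squaring) between the two layers
with_bend = layer_two(layer_one(x) ** 2)
print(np.diff(with_bend).tolist())   # [-24, -48, -72]
```

The chained linear functions still take constant steps, so together they are just another line. Only when a nonlinear step sits between them do the steps start to change.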


Practice Exercises

Try these before checking the hints.

Exercise 1: Identify Slope and y-intercept

Given the linear function $y = -4x + 7$, state its slope and its y-intercept without running any code. Then confirm the y-intercept by evaluating the function at $x = 0$ in Python.

def f(x):
    return -4 * x + 7

# Your code here: evaluate f at x = 0

Hint

Compare $y = -4x + 7$ to the general form $y = mx + b$. The slope is the number multiplying $x$, and the y-intercept is the standalone constant. Calling f(0) should return 7, the y-intercept, because the $-4x$ term vanishes when $x = 0$.

Exercise 2: Confirm a Constant Slope

Reuse the slope(x1, y1, x2, y2) function from this lesson. Pick three different pairs of points on the line $y = -4x + 7$ and confirm that the slope is the same every time.

def slope(x1, y1, x2, y2):
    return (y2 - y1) / (x2 - x1)

# Your code here: compute the slope for three pairs of points on y = -4x + 7

Hint

First generate points by plugging x-values into the function: for $x = 0$ you get $y = 7$, for $x = 1$ you get $y = 3$, for $x = 2$ you get $y = -1$. Feed any two of these into slope. Every pair should return -4.0, because a linear function has one constant slope.

Exercise 3: Show That a Curve’s Slope Changes

For the nonlinear function $y = x^2 - 2$, use the slope function to compute the slope between $x = 0$ and $x = 1$, and then between $x = 2$ and $x = 3$. Compare the two results.

# Your code here: reuse slope() and the function y = x**2 - 2

Hint

Compute each y-value first: at $x = 0$, $y = -2$; at $x = 1$, $y = -1$; at $x = 2$, $y = 2$; at $x = 3$, $y = 7$. The first slope is $(-1 - (-2)) / (1 - 0) = 1$ and the second is $(7 - 2) / (3 - 2) = 5$. Different answers prove the slope is not constant on a curve.


Summary

You have met the most fundamental objects in the math of machine learning: functions, and the split between linear and nonlinear ones. Let’s review.

Key Concepts

Functions

  • A function is a rule that maps each input $x$ to exactly one output $y$, written $y = f(x)$
  • Most machine learning models are functions; training means choosing the numbers inside the function so it fits the data

Linear Functions

  • Every linear function has the form $y = mx + b$ and draws a straight line
  • $m$ is the slope (steepness) and $b$ is the y-intercept (where the line crosses the y-axis at $x = 0$)
  • Slope is a rate of change, $m = \Delta y / \Delta x$, the rise over the run
  • A linear function has one constant slope: the rate of change is identical between any two points

Nonlinear Functions

  • A function is nonlinear when $x$ appears with a power other than 1 (or inside a root, in a denominator, and so on), and it draws a curve
  • A nonlinear function has a changing slope: the rate of change differs from point to point, so two points are not enough to describe it
  • Examples include $y = x^2 - 2$, $y = x^3$, and $y = \sqrt{x}$

Computing with NumPy

  • np.linspace(start, stop, num) creates evenly spaced inputs
  • Arithmetic like 2 * x + 1 or x**2 - 2 applies to every element of an array at once
  • np.diff reveals constant steps for a line and growing steps for a curve

Why This Matters

The distinction you learned here runs through the entire course. Linear functions are the backbone of linear regression and the building blocks of larger models, and their constant slope is what makes them easy to reason about. But the world rarely moves in straight lines, and nonlinearity is what lets models bend to fit real, curved data, from house prices to neural networks.

There is also a deeper thread waiting. On a straight line you can read off the slope from any two points, but on a curve the slope changes everywhere, and a single pair of points cannot pin it down. Measuring how steep a curve is at one exact point is the central question of calculus, and answering it is what the next lessons build toward.


Next Steps

You now understand functions, slope, and the difference between straight lines and curves. The natural next question is how to talk precisely about what a function is doing as its input gets arbitrarily close to a particular value, which is the idea of a limit.

Continue to Lesson 2 - Understanding Limits

Learn how limits describe a function's behavior as the input approaches a value, even where the function is undefined.



Keep Building Your Skills

You have taken the first step into the mathematics that powers machine learning. Functions are the language every model speaks, and the linear-versus-nonlinear split you learned here will resurface in nearly every lesson that follows. Keep the two mental pictures close: a straight line with one steady slope, and a curve whose steepness shifts at every point. As the math grows, those two images will keep the new ideas grounded.