Lesson 7 - Solution Sets and Linear Independence
Welcome to the Capstone of the Math Module
This lesson ties together everything you have learned about vectors, matrices, and linear systems. You will learn what it means for a set of vectors to be linearly independent, how that connects to the span of those vectors and to the solution set of a linear system, and how to measure all of this in a single number called rank. By the end, you will see exactly why a set of independent vectors can fill an entire plane while a set of dependent vectors collapses onto a single line.
By the end of this lesson, you will be able to:
- Explain what linear independence and linear dependence mean, in plain words and with vectors
- Describe the span of a set of vectors and connect it to solution sets of a linear system
- Compute the rank of a matrix with NumPy and interpret what it tells you
- Recognize when a system has no solution, one solution, or infinitely many solutions
- Connect rank, independence, and redundancy to real machine learning problems
You should be comfortable with vectors, the dot product, matrix multiplication, and solving small linear systems, all covered in the earlier lessons of this module. We only need NumPy here. Let’s finish strong.
The Question Behind Everything
Every linear system you have solved so far has quietly been asking one question: can I build the target vector $\mathbf{b}$ out of the columns of $A$?
When you write a system as $A\mathbf{x} = \mathbf{b}$, you are not just shuffling numbers. You are asking whether some combination of the columns of $A$, weighted by the entries of $\mathbf{x}$, adds up to $\mathbf{b}$. That single reframing is the key to this entire lesson.
Consider a tiny system:
$$ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 5 \\ 11 \end{bmatrix} $$
This is the same as asking: what amounts $x_1$ and $x_2$ of the two column vectors $(1, 3)$ and $(2, 4)$ do you need to reach $(5, 11)$?
A weighted sum of vectors like this is called a linear combination. Solving a linear system is finding the right linear combination. Whether a solution exists, and whether it is unique, depends entirely on the relationship between those column vectors. That relationship has a name: linear independence.
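As a quick check of the column picture, the sketch below solves the tiny system above with `np.linalg.solve` and then rebuilds the target as a weighted sum of the columns. The variable names are illustrative and not part of the lesson's own code.

```python
import numpy as np

# Coefficient matrix and target from the tiny system above
A = np.array([
    [1.0, 2.0],
    [3.0, 4.0],
])
b = np.array([5.0, 11.0])

# Solve A x = b for the weights x_1 and x_2
x = np.linalg.solve(A, b)

# Rebuild b as a linear combination of the columns of A
combination = x[0] * A[:, 0] + x[1] * A[:, 1]

print("Weights:", x)
print("Combination matches b:", np.allclose(combination, b))
```

The weights come out to approximately 1 and 2, which matches the hand check: one copy of the first column plus two copies of the second gives $(1 + 4,\ 3 + 8) = (5, 11)$.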
Linear Combinations and Span
Before independence, you need one more idea: the span.
The span of a set of vectors is every point you can reach by taking linear combinations of them. Pick any weights you like, positive or negative, large or small, add up the scaled vectors, and the result lands somewhere. Collect all those possible landing spots and you have the span.
Start with a single vector in the plane, say $\mathbf{v}_1$. Scaling it by every possible number sweeps out a straight line through the origin. So the span of one (nonzero) vector is a line.
Now add a second vector $\mathbf{v}_2$ that points in a genuinely different direction. By mixing the two, you can now reach any point in the plane. The span of these two vectors is the entire plane $\mathbb{R}^2$.
But watch what happens if the second vector is just a stretched copy of the first, $\mathbf{v}_2 = c\,\mathbf{v}_1$ for some number $c$. Adding it gives you nothing new: any combination of $\mathbf{v}_1$ and $\mathbf{v}_2$ still lands on the same line. The span has not grown.
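One way to see this numerically is to sample many random linear combinations and check where they land. The sketch below is only an illustration: the vectors `v1`, `stretched`, and `different` are assumed values chosen for the demo, and sampling can suggest the shape of a span but does not prove it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative vectors (assumed for this sketch)
v1 = np.array([1.0, 2.0])
stretched = 2 * v1                  # a scaled copy of v1
different = np.array([1.0, -1.0])   # not a multiple of v1

# 1000 random pairs of weights (a, b)
weights = rng.uniform(-5, 5, size=(1000, 2))

# Each row of the result is a * v1 + b * (second vector)
points_dependent = weights @ np.vstack([v1, stretched])
points_independent = weights @ np.vstack([v1, different])

def all_on_line(points):
    """Check whether every point satisfies y = 2x, the line through v1 = (1, 2)."""
    return bool(np.allclose(points[:, 1], 2 * points[:, 0]))

stuck_on_line = all_on_line(points_dependent)
leaves_line = not all_on_line(points_independent)

print("v1 with its stretched copy stays on y = 2x:", stuck_on_line)
print("v1 with a different direction leaves that line:", leaves_line)
```

Both checks should report `True`: combinations with the stretched copy never leave the line, while combinations with a different direction do.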
Span is about coverage
Think of each vector as a direction you are allowed to walk in. Span asks: with these directions, what part of space can you ever reach? Two truly different directions in 2D let you reach everywhere. A direction and a copy of it leave you stuck on one line, no matter how cleverly you combine them.
That difference, between vectors that open up new directions and vectors that do not, is precisely linear independence.
Linear Independence vs. Dependence
A set of vectors is linearly independent when no vector in the set can be written as a linear combination of the others. Each one contributes a direction that the others cannot reproduce. Nothing is redundant.
A set is linearly dependent when at least one vector can be built from the others. That vector is redundant: it adds no new direction, so removing it would not shrink the span.
In two dimensions the picture is clean:
- Two vectors are independent when they point in different directions. Together they span the whole plane.
- Two vectors are dependent when one is a scalar multiple of the other. They lie on the same line, and together they span only that line.
The figure below shows both cases side by side.
Here is the formal definition, which generalizes cleanly to any number of dimensions. Vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n$ are linearly independent if the only way to make
$$ c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_n \mathbf{v}_n = \mathbf{0} $$
is to set every weight $c_i = 0$. If you can satisfy that equation with some weight that is not zero, the vectors are dependent, because you could then solve for one vector in terms of the others.
A quick mental test
For just two vectors, the check is fast: is one a scalar multiple of the other? If yes, they are dependent. For example, $(1, 2)$ and $(2, 4)$ are dependent because the second is exactly twice the first, while $(2, 1)$ and $(1, 3)$ are independent because no single number turns one into the other.
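The mental test can be written as a small helper. This is a minimal sketch, and `are_dependent_2d` is a hypothetical name, not a NumPy function. It relies on the fact that for 2D vectors the quantity `u[0] * v[1] - u[1] * v[0]` is zero exactly when the two vectors lie on the same line through the origin.

```python
import numpy as np

def are_dependent_2d(u, v):
    """Return True when two 2D vectors lie on the same line through the origin."""
    # This expression is the determinant of the 2 x 2 matrix with columns u and v
    cross = u[0] * v[1] - u[1] * v[0]
    return bool(np.isclose(cross, 0.0))

pair_a = are_dependent_2d(np.array([1.0, 2.0]), np.array([2.0, 4.0]))
pair_b = are_dependent_2d(np.array([2.0, 1.0]), np.array([1.0, 3.0]))

print("(1, 2) and (2, 4) dependent:", pair_a)
print("(2, 1) and (1, 3) dependent:", pair_b)
```

The first pair is reported as dependent and the second as independent. The shortcut only works for a pair of 2D vectors.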
Measuring Independence with Rank
Eyeballing two vectors is easy. With ten vectors in ten dimensions it is hopeless. You need a number that summarizes how many genuinely independent directions a set of vectors contains. That number is the rank.
The rank of a matrix is the number of linearly independent columns it has (which, conveniently, always equals the number of linearly independent rows). Equivalently, it is the dimension of the span of the columns. Stack your vectors as the columns of a matrix and the rank tells you how much space they actually fill.
NumPy computes it directly with np.linalg.matrix_rank. Let’s start with an independent set.
import numpy as np
# Two independent vectors, stacked as the columns of a matrix
independent = np.array([
[2, 1],
[1, 3],
])
rank_independent = np.linalg.matrix_rank(independent)
print("Rank of independent set:", rank_independent)
# Output: Rank of independent set: 2
A rank of 2 for a set of 2D vectors means both directions are genuinely distinct. Two independent directions span a two-dimensional space: the full plane. There is no redundancy.
Now the dependent case, where the second column is exactly twice the first.
# Two dependent vectors: the second column is 2x the first
dependent = np.array([
[1, 2],
[2, 4],
])
rank_dependent = np.linalg.matrix_rank(dependent)
print("Rank of dependent set:", rank_dependent)
# Output: Rank of dependent set: 1
The rank is 1, not 2, even though there are two columns. NumPy is telling you that only one independent direction is present. The second column added nothing new. These vectors span a one-dimensional space: a single line through the origin.
This is the computational heart of the whole lesson:
- Rank 2 for two 2D vectors means independent: they span the plane.
- Rank 1 for two 2D vectors means dependent: they span only a line.
When the rank equals the number of vectors, the set is independent. When the rank is smaller, some vectors are redundant.
Why rank, not the determinant
For a square matrix you could check independence with the determinant: a zero determinant means dependent columns. Rank is more general because it works for non-square matrices too, and it tells you how many independent directions exist, not just whether all of the columns are independent. That extra information is exactly what you need when matrices are tall, wide, or close to singular.
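To make the difference concrete, here is a small sketch with a tall matrix whose values are made up for illustration. `np.linalg.det` raises `LinAlgError` on a non-square input, while `np.linalg.matrix_rank` still answers the independence question.

```python
import numpy as np

# A tall (3 x 2) matrix: two column vectors living in 3D (illustrative values)
tall = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
])

# The determinant is only defined for square matrices, so NumPy raises an error
try:
    np.linalg.det(tall)
    det_failed = False
except np.linalg.LinAlgError:
    det_failed = True

rank_tall = np.linalg.matrix_rank(tall)

print("Determinant available:", not det_failed)
print("Rank:", rank_tall, "of", tall.shape[1], "columns")
```

Here the rank equals the number of columns, so the two columns are independent even though no determinant exists for this shape.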
Connecting rank back to determinant
In an earlier lesson you saw that a square matrix is singular when its determinant is zero, meaning it has no inverse. That is the same phenomenon viewed from a different angle. A singular matrix has dependent columns, so its rank is less than its size. Let’s confirm it on the dependent matrix above, whose determinant is $1 \cdot 4 - 2 \cdot 2 = 0$.
print("Determinant:", np.linalg.det(dependent))
# Output: Determinant: 0.0
print("Rank:", np.linalg.matrix_rank(dependent)) # less than 2 -> singular
# Output: Rank: 1
A determinant of 0 and a rank below full size are two ways of saying the same thing: the columns are linearly dependent, so the transformation collapses the plane onto a line and cannot be undone.
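"Cannot be undone" can also be checked directly. The sketch below asks NumPy to invert the dependent matrix and catches the resulting error. It is a small illustration, with `inverse_exists` as a hypothetical flag name.

```python
import numpy as np

dependent = np.array([
    [1.0, 2.0],
    [2.0, 4.0],
])

# np.linalg.inv raises LinAlgError when the matrix is exactly singular
try:
    np.linalg.inv(dependent)
    inverse_exists = True
except np.linalg.LinAlgError:
    inverse_exists = False

print("Inverse exists:", inverse_exists)
```

Note that this error is only raised for exactly singular matrices. A matrix that is merely close to singular may invert without complaint while producing unreliable numbers, which is one more reason to look at rank.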
How Independence Decides the Solution Set
Now you can answer the question this lesson opened with. The solution set of $A\mathbf{x} = \mathbf{b}$ is the collection of every vector $\mathbf{x}$ that satisfies the equation. Independence of the columns of $A$ controls which of three situations you land in.
One solution. When the columns of $A$ are independent and $\mathbf{b}$ lies in their span, there is exactly one combination of columns that reaches $\mathbf{b}$. The solution set is a single point.
Infinitely many solutions. When the columns are dependent but $\mathbf{b}$ still lies in their span, there is more than one way to build $\mathbf{b}$, because the redundant direction lets you trade weight around freely. The solution set is a whole line or plane of vectors. The extra freedom shows up as a free variable, a coordinate you can set to any value while the rest adjust to compensate.
No solution. When $\mathbf{b}$ lies outside the span of the columns, no combination can ever reach it. The solution set is empty, and the system is called inconsistent.
A picture of the dependent case makes “no solution” intuitive. Two equations whose coefficient vectors are multiples of each other plot as lines with the same slope. If the constants are not in that same ratio, the lines are parallel and distinct, so they never meet and there is no solution.
$$ \begin{bmatrix} 8 & 4 \\ 4 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 5 \\ 5 \end{bmatrix} $$
The coefficient matrix here has dependent rows (the first is twice the second), so its rank is 1, not 2. Let’s see what NumPy says and confirm the system is inconsistent.
A = np.array([
[8, 4],
[4, 2],
])
print("Rank of A:", np.linalg.matrix_rank(A))
# Output: Rank of A: 1
# Augment A with the target b to compare ranks
b = np.array([[5], [5]])
augmented = np.hstack([A, b])
print("Rank of [A | b]:", np.linalg.matrix_rank(augmented))
# Output: Rank of [A | b]: 2
The coefficient matrix has rank 1, but gluing the target on as an extra column raises the rank to 2. That mismatch is the signal of an inconsistent system: $\mathbf{b}$ points in a direction the columns of $A$ cannot reach, so no solution exists. When the rank of $A$ and the rank of the augmented matrix agree, a solution exists; when they differ, it does not.
Dependent does not automatically mean no solution
Dependence in the columns of $A$ means you have lost a direction, but whether the system has infinitely many solutions or none depends on $\mathbf{b}$. If $\mathbf{b}$ lies in the (smaller) span, you get infinitely many solutions; if it lies outside, you get none. Always compare the rank of $A$ with the rank of the augmented matrix to tell the two apart.
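The three cases can be folded into one helper that compares the two ranks and the column count. This is a minimal sketch under the lesson's rules: `classify_system` is a hypothetical name, and the third example (target `[10, 5]`) is an assumed target chosen so that it lies in the span of the dependent columns.

```python
import numpy as np

def classify_system(A, b):
    """Classify A x = b as 'none', 'unique', or 'infinite' by comparing ranks."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)

    rank_A = np.linalg.matrix_rank(A)
    rank_augmented = np.linalg.matrix_rank(np.hstack([A, b]))

    if rank_A != rank_augmented:
        return "none"       # b lies outside the span of the columns
    if rank_A == A.shape[1]:
        return "unique"     # independent columns, b in their span
    return "infinite"       # dependent columns, b in their span

case_unique = classify_system([[1, 2], [3, 4]], [5, 11])
case_none = classify_system([[8, 4], [4, 2]], [5, 5])
case_infinite = classify_system([[8, 4], [4, 2]], [10, 5])

print(case_unique, case_none, case_infinite)
```

The same dependent matrix produces "none" for one target and "infinite" for another, so the outcome depends on the target as well as on the coefficient matrix.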
Homogeneous Systems: the One Case That Always Has a Solution
There is one special family worth naming. A system is homogeneous when the target is the zero vector:
$$ A\mathbf{x} = \mathbf{0} $$
A homogeneous system can never be inconsistent, because setting $\mathbf{x} = \mathbf{0}$ always works: scaling every column by zero gives the zero vector. That guaranteed answer is called the trivial solution.
The interesting question for a homogeneous system is whether other solutions exist beyond the trivial one. And independence answers it cleanly:
- If the columns of $A$ are independent (full rank), the trivial solution is the only solution.
- If the columns are dependent (rank below full), there are infinitely many solutions, forming a solution space through the origin.
# Independent columns -> only the trivial solution x = 0
A_indep = np.array([
[2, 1],
[1, 3],
])
print("Rank:", np.linalg.matrix_rank(A_indep), "of 2 columns -> only x = 0")
# Output: Rank: 2 of 2 columns -> only x = 0
# Dependent columns -> a whole line of solutions through the origin
A_dep = np.array([
[1, 2],
[2, 4],
])
print("Rank:", np.linalg.matrix_rank(A_dep), "of 2 columns -> infinitely many solutions")
# Output: Rank: 1 of 2 columns -> infinitely many solutions
For the dependent matrix, every vector along a particular line satisfies $A\mathbf{x} = \mathbf{0}$. That line of extra solutions exists precisely because the redundant column left a direction free to vary.
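That line of solutions can be found numerically. One common approach, used here as an assumption since the lesson does not name a method, is the singular value decomposition: the right singular vector paired with the zero singular value points along the solution line. Its sign is arbitrary, so no exact output is shown for it.

```python
import numpy as np

A_dep = np.array([
    [1.0, 2.0],
    [2.0, 4.0],
])

# Rows of vt are right singular vectors, ordered from largest to smallest singular value
_, singular_values, vt = np.linalg.svd(A_dep)
null_direction = vt[-1]   # paired with the (near-)zero singular value

# Every multiple of that direction should satisfy A x = 0
checks = []
for t in (-3.0, 0.5, 10.0):
    x = t * null_direction
    checks.append(bool(np.allclose(A_dep @ x, 0.0)))

print("Direction of the solution line:", null_direction)
print("All multiples solve A x = 0:", all(checks))
```

The direction returned is parallel to $(2, -1)$, which you can confirm by hand: $2 \cdot (1, 2) - 1 \cdot (2, 4) = (0, 0)$.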
Why This Matters for Machine Learning
This is not abstract bookkeeping. Linear independence and rank sit underneath a surprising amount of practical machine learning.
Redundant features. Imagine a dataset with a column for height in centimeters and another for height in inches. One is a scalar multiple of the other, so as feature vectors they are linearly dependent. They contribute one direction of information, not two. Worse, a feature that is a combination of others (say total = price + tax) is dependent on them too. These redundant columns increase the column count of your feature matrix without raising its rank, so the matrix falls short of full rank and gains no information.
Solvability of linear regression. Linear regression solves a system that involves $X^\top X$, where $X$ is your feature matrix. If two features are perfectly dependent, that matrix is singular (rank-deficient) and cannot be inverted, so the standard solution does not exist. This failure mode is called perfect multicollinearity, and it is exactly the “rank below full” situation from this lesson, dressed in statistics vocabulary.
Numerical stability. Even features that are nearly dependent, pointing in almost the same direction, push a matrix close to singular. The math technically works, but the solution becomes wildly sensitive to tiny changes in the data, producing unstable, untrustworthy coefficients. Checking rank, or watching for near-zero singular values and a very large condition number, warns you before this bites.
# A feature matrix where column 3 = column 1 + column 2 (redundant!)
X = np.array([
[1.0, 2.0, 3.0],
[4.0, 1.0, 5.0],
[2.0, 5.0, 7.0],
[3.0, 3.0, 6.0],
])
print("Columns:", X.shape[1])
print("Rank: ", np.linalg.matrix_rank(X))
# Output:
# Columns: 3
# Rank: 2
Three columns but a rank of 2: NumPy has detected that one feature is redundant. In practice you would drop a column, combine the duplicates, or use a technique like regularization that tolerates dependence. Either way, the rank is what told you the problem was there.
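Exact redundancy is the easy case to catch. For the nearly dependent features described under numerical stability, one possible warning sign is the condition number from `np.linalg.cond`. The sketch below uses made-up feature columns, and what counts as "too large" is a judgment call rather than a fixed rule.

```python
import numpy as np

base = np.array([1.0, 2.0, 3.0, 4.0])
alternating = np.array([1.0, -1.0, 1.0, -1.0])

# Two clearly different feature columns
X_healthy = np.column_stack([base, alternating])

# Second column is the first plus a tiny perturbation: nearly dependent
X_near = np.column_stack([base, base + 1e-6 * alternating])

cond_healthy = np.linalg.cond(X_healthy)
cond_near = np.linalg.cond(X_near)

print("Condition number (healthy):", cond_healthy)
print("Condition number (nearly dependent):", cond_near)
```

The healthy matrix has a small condition number, while the nearly dependent one is in the millions, even though its two columns are not exact multiples of each other.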
Rank is a sanity check
Before training a linear model, comparing the rank of your feature matrix to its number of columns is a cheap, powerful sanity check. If the rank is smaller, you have redundant or perfectly correlated features that will destabilize the model. Catching that early saves hours of confused debugging later.
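The sanity check can also come from the fitting routine itself. As one illustration, `np.linalg.lstsq` returns the rank it detected alongside the coefficients, and it still produces a least-squares answer when columns are dependent. The target values `y` below are hypothetical, built from the first two features only.

```python
import numpy as np

# Same feature matrix as above: column 3 = column 1 + column 2
X = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 1.0, 5.0],
    [2.0, 5.0, 7.0],
    [3.0, 3.0, 6.0],
])

# Hypothetical targets: 1 * feature_1 + 2 * feature_2
y = X @ np.array([1.0, 2.0, 0.0])

coef, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

print("Columns:", X.shape[1])
print("Rank reported by lstsq:", rank)
print("Predictions reproduce y:", np.allclose(X @ coef, y))
```

Because the columns are dependent, many coefficient vectors fit equally well, and the one returned need not match the weights used to build `y`. Treat the individual coefficients of a rank-deficient fit with caution, even when the predictions look right.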
Practice Exercises
Now it is your turn. Try each one before opening the hint.
Exercise 1: Test a Pair of Vectors
Use NumPy to determine whether the vectors $(3, 6)$ and $(1, 2)$ are linearly independent or dependent. Stack them as columns of a matrix, compute the rank, and print whether they span the plane or only a line.
import numpy as np
# Your code here
Hint
Build the matrix with np.array([[3, 1], [6, 2]]) so the vectors are columns, then call np.linalg.matrix_rank(...). Notice that the first vector is exactly 3 times the second, so expect a rank of 1, meaning dependent and spanning only a line.
Exercise 2: Find the Redundant Feature
You are given a feature matrix where one column is a combination of the others. Compute its rank, compare that to the number of columns, and print how many redundant directions there are.
import numpy as np
X = np.array([
[1.0, 0.0, 2.0],
[0.0, 1.0, 1.0],
[2.0, 1.0, 5.0],
[1.0, 1.0, 3.0],
])
# Your code here
Hint
Compute rank = np.linalg.matrix_rank(X) and compare it to X.shape[1]. The number of redundant directions is X.shape[1] - rank. The third column equals twice the first plus the second, so the rank is 2 and there is 1 redundant direction.
Exercise 3: Consistent or Inconsistent?
For the system below, decide whether a solution exists by comparing the rank of the coefficient matrix $A$ to the rank of the augmented matrix $[A \mid \mathbf{b}]$.
import numpy as np
A = np.array([
[1, 2],
[2, 4],
])
b = np.array([[3], [7]])
# Your code here
Hint
Build the augmented matrix with np.hstack([A, b]), then compare np.linalg.matrix_rank(A) to np.linalg.matrix_rank(augmented). Here A has rank 1, but the augmented matrix has rank 2, so the ranks differ and the system is inconsistent: no solution exists.
Summary
Congratulations! You have reached the capstone of the math foundations module and connected vectors, matrices, and linear systems into one coherent picture. Let’s review.
Key Concepts
Linear Combinations and Span
- A linear combination is a weighted sum of vectors; solving $A\mathbf{x} = \mathbf{b}$ means finding the combination of columns that builds $\mathbf{b}$
- The span of a set of vectors is every point reachable by their linear combinations
- One nonzero vector spans a line; two independent vectors span the whole plane
Linear Independence
- Vectors are independent when none can be built from the others; every one adds a new direction
- Vectors are dependent when one is a combination (in 2D, a scalar multiple) of the others; one is redundant
- Formally, independence means $c_1 \mathbf{v}_1 + \cdots + c_n \mathbf{v}_n = \mathbf{0}$ only when all $c_i = 0$
Rank
- Rank is the number of independent directions in a matrix, computed with np.linalg.matrix_rank
- Independent 2D vectors give rank 2 (span the plane); dependent ones give rank 1 (span a line)
- Full rank (rank equals number of columns) means independent; lower rank means redundancy
- A zero determinant and a rank below full size both signal dependent columns
Solution Sets
- The solution set holds every $\mathbf{x}$ satisfying the system; it can be empty, a point, or infinite
- Independent columns with $\mathbf{b}$ in their span give exactly one solution
- Dependent columns with $\mathbf{b}$ in their span give infinitely many (free variables appear)
- When $\mathbf{b}$ is outside the span, the system is inconsistent with no solution
- Compare rank(A) with rank([A | b]): equal means a solution exists, different means none does
- Homogeneous systems ($A\mathbf{x} = \mathbf{0}$) always have the trivial solution; dependence adds infinitely many more
Why This Matters
Linear independence is the quiet idea behind whether a model can even be fit. Linear regression breaks down when features are perfectly dependent, because the matrix it must invert becomes singular, exactly the rank-deficient case you computed here. Features that merely overlap heavily push that matrix toward singularity and make a model’s coefficients unstable and untrustworthy. This is multicollinearity, and now you understand it from the ground up: it is just dependent columns wearing a statistics name.
More broadly, rank gives you a single number that answers a deep question about your data: how much genuinely distinct information is in it? A feature matrix with ten columns but a rank of seven is carrying only seven directions of real information. Recognizing that lets you drop redundancy, stabilize models, and reason clearly about what your data can and cannot support. That habit of looking past the surface to the underlying structure is what the entire math foundations module has been preparing you for.
Next Steps
You have completed the math foundations of the program. With functions, limits, derivatives, gradient descent, linear systems, vectors, matrices, and now independence and rank under your belt, you have the mathematical vocabulary that every machine learning algorithm is built from.
Continue to the Next Module - Machine Learning Foundations
With the math in hand, start the hands-on machine learning workflow: features, models, and evaluation with scikit-learn.
Back to Module Overview
Return to the Math Foundations module overview to review any lesson.
Keep Building Your Skills
You have finished the mathematical backbone of this program. The ideas you practiced here, span, independence, rank, and solution sets, are not isolated facts to memorize; they are the lens through which experienced practitioners read their data and their models. Every time you wonder whether a system can be solved, whether a feature is redundant, or why a model refuses to converge, you will reach for the same question this lesson centered on: how many independent directions are really here? Carry that question forward, and the more advanced algorithms ahead will feel far less like magic and far more like familiar mathematics doing its job.