Finding a basis for a set of vectors: reasoning behind reduced row echelon form

linear algebra

I had a question about finding a basis for a set of vectors. So one way to do it is to write the vectors as the columns of a matrix and convert that matrix to reduced row-echelon form, correct? And then the pivot columns of the reduced row-echelon form tell us which columns of the original matrix give us a basis for that set of vectors.
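For concreteness, here's a rough sketch of the procedure I mean, using SymPy (just an illustration; `rref()` conveniently returns the pivot column indices as well):

```python
from sympy import Matrix

def basis_of_span(vectors):
    # Put the vectors in as the columns of a matrix...
    A = Matrix.hstack(*[Matrix(v) for v in vectors])
    # ...reduce to reduced row-echelon form; rref() also reports
    # which columns contain pivots...
    _, pivot_cols = A.rref()
    # ...and keep the ORIGINAL vectors sitting in the pivot columns.
    return [vectors[j] for j in pivot_cols]

print(basis_of_span([[1, 0], [0, 1], [1, 1]]))  # [[1, 0], [0, 1]]
```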

I don't quite see how this works. Why should this be true? Can someone please help me?

Best Answer

Linear independence has a few equivalent formulations. The one that most resembles the matrix version is this one:

A list $(v_1, \ldots, v_n)$ is linearly dependent if and only if there exists some $v_i$ such that $$v_i \in \operatorname{span}(v_1, \ldots, v_{i-1}).$$
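For a quick concrete instance of this formulation: in $\Bbb{R}^2$, the list $((1,0),\,(0,1),\,(1,1))$ is linearly dependent, and indeed the third vector satisfies $$(1,1) = (1,0) + (0,1) \in \operatorname{span}\big((1,0),(0,1)\big).$$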

In other words, at some point as you add the vectors of the list one by one, you add a vector that doesn't expand the span at all. To prove this, consider the least $i$ such that $(v_1, \ldots, v_i)$ is linearly dependent (such an $i$ exists by the well-ordering property of $\Bbb{N}$, since the whole list itself is linearly dependent). Then $$a_1 v_1 + \ldots + a_i v_i = 0 \tag{1}$$ for some scalars $a_1, \ldots, a_i$, not all of which are equal to $0$.

If $a_i = 0$, then $(1)$ reduces to $$a_1 v_1 + \ldots + a_{i-1} v_{i-1} = 0$$ where $a_1, \ldots, a_{i-1}$ are not all $0$. This implies $(v_1, \ldots, v_{i-1})$ is linearly dependent, contradicting the minimality of $i$. Therefore, $a_i \neq 0$. Hence, $(1)$ becomes $$v_i = -\frac{a_1}{a_i} v_1 - \ldots - \frac{a_{i-1}}{a_i}v_{i-1} \in \operatorname{span}(v_1, \ldots, v_{i-1}),$$ as required.

Conversely, if $v_i \in \operatorname{span}(v_1, \ldots, v_{i-1})$, then $$v_i = a_1 v_1 + \ldots + a_{i-1}v_{i-1}$$ for some $a_1, \ldots, a_{i-1}$, and hence $$a_1 v_1 + \ldots + a_{i-1}v_{i-1} + (-1)v_i + 0v_{i+1} + \ldots + 0v_n = 0$$ is a non-trivial null linear combination of $(v_1, \ldots, v_n)$, making the list linearly dependent.


How does this help us? Think about the procedure for figuring out whether a column vector $w$ belongs to the span of a list of column vectors $(v_1, \ldots, v_n)$. The first thing to do is set up an equation to solve: $$w = a_1 v_1 + \ldots + a_n v_n.$$ This is a system of linear equations in the variables $a_1, \ldots, a_n$, which we can encode as an augmented matrix $$\left[\begin{array}{ccc|c}&&&\\v_1&\dots&v_n&w\\&&&\end{array}\right]. \tag{2}$$ We then reduce this down to a row-echelon form and examine the zero rows. If every zero row of the coefficient part also has a $0$ in the augmented column, then the system is consistent and $w$ belongs to the span. If one of the zero rows has a non-zero entry in the augmented column, then the system has no solution and $w$ does not belong to the span.

However, think about this: there's no real difference between an augmented column and a regular column! Reducing the matrix from $(2)$ to row-echelon form requires exactly the same procedure as reducing $$\left[\begin{array}{cccc}&&&\\v_1&\dots&v_n&w\\&&&\end{array}\right].$$ Getting a pivot in the final column (the one corresponding to $w$) is equivalent to one of the zero rows in the row-echelon form of $(2)$ having a non-zero entry in the augmented column. So, a pivot in the final column means precisely that $w$ does not belong to the span of the previous vectors.
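As a tiny concrete check, take $v_1 = \begin{bmatrix}1\\2\end{bmatrix}$ and test the two candidates $w = \begin{bmatrix}2\\4\end{bmatrix}$ and $w = \begin{bmatrix}2\\5\end{bmatrix}$: $$\left[\begin{array}{c|c}1 & 2\\2 & 4\end{array}\right] \sim \left[\begin{array}{c|c}1 & 2\\0 & 0\end{array}\right], \qquad \left[\begin{array}{c|c}1 & 2\\2 & 5\end{array}\right] \sim \left[\begin{array}{c|c}1 & 2\\0 & 1\end{array}\right].$$ In the first case the zero row is zero in the augmented column as well, so $w = 2v_1 \in \operatorname{span}(v_1)$. In the second, the bottom row $[\,0 \mid 1\,]$ is simultaneously "a zero row with a non-zero augmented entry" and "a pivot in the final column", so $w \notin \operatorname{span}(v_1)$.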

Finally, we should note that this doesn't just hold for the vector in the final column. Think about how we routinely compute a row-echelon form of a matrix. We first secure a pivot at the top of the leftmost column (provided that column is not all $0$), and then eliminate every entry below this pivot. We then proceed to the next column, leaving the first row alone, and repeat the process to place a pivot as far left as possible in the second row, continuing to work our way to the right.

In essence, we try to shore up columns from left to right. The process of shoring up the leftmost column is precisely the full process for turning just that column vector into row-echelon form. The process for shoring up the two leftmost columns is precisely the full process for turning the submatrix containing just those two columns into row-echelon form.

Let's briefly illustrate this:

\begin{align*} \begin{bmatrix}1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9\end{bmatrix} &\sim \begin{bmatrix}1 & 2 & 3 \\ 0 & -3 & -6 \\ 7 & 8 & 9\end{bmatrix} \\ &\sim \begin{bmatrix}\color{red}{1} & 2 & 3 \\ \color{red}{0} & -3 & -6 \\ \color{red}{0} & -6 & -12\end{bmatrix} \\ &\sim \begin{bmatrix}1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & -6 & -12\end{bmatrix} \\ &\sim \begin{bmatrix}\color{red}1 & \color{red}2 & 3 \\ \color{red}0 & \color{red}1 & 2 \\ \color{red}0 & \color{red}0 & 0\end{bmatrix}. \end{align*} The highlights show the submatrices turning into row-echelon form. The first highlight shows the row-echelon form of the submatrix $$\begin{bmatrix}1 \\ 4 \\ 7\end{bmatrix}.$$ The second highlight shows the row-echelon form of the submatrix $$\begin{bmatrix} 1 & 2 \\ 4 & 5 \\ 7 & 8\end{bmatrix}.$$ This is always true when reducing a matrix: you always reduce more and more columns, until the whole thing is in row-echelon form.

So, what this means is that when you row reduce a matrix $$\begin{bmatrix}&&\\v_1&\dots&v_n\\&&\end{bmatrix},$$ you are simultaneously checking whether $v_2 \in \operatorname{span}(v_1)$, whether $v_3 \in \operatorname{span}(v_1, v_2)$, whether $v_4 \in \operatorname{span}(v_1, v_2, v_3)$, etc. There will be a pivot in the $i$th column if and only if $v_i \notin \operatorname{span}(v_1, \ldots, v_{i-1})$. That is, there will be a pivot in the $i$th column if and only if $v_i$ adds something to the span that the previous vectors do not already contribute. Discarding the vectors whose columns have no pivots will therefore not damage the span, and what you're left with is a list of vectors, none of which belongs to the span of the previous ones; hence the list is linearly independent.
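If you want to see this computationally, here is a minimal sketch using SymPy (any CAS with an rref routine would do just as well); `Matrix.rref()` returns both the reduced form and the tuple of pivot column indices, so a basis is given by the corresponding columns of the original matrix:

```python
from sympy import Matrix

# The matrix from the worked example above; its columns are v1, v2, v3.
A = Matrix([[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]])

R, pivot_cols = A.rref()   # reduced row-echelon form + pivot column indices
print(R)                   # Matrix([[1, 0, -1], [0, 1, 2], [0, 0, 0]])
print(pivot_cols)          # (0, 1): pivots in the first two columns only,
                           # so v3 is in span(v1, v2)

# A basis for the span: the pivot columns of the ORIGINAL matrix.
basis = [A.col(j) for j in pivot_cols]
print(basis)               # [Matrix([[1], [4], [7]]), Matrix([[2], [5], [8]])]
```

Note that the basis vectors are taken from the original matrix, not from its reduced form: row operations change the columns themselves, but they preserve exactly which columns depend on the earlier ones.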