[Math] the intuitive meaning of pivot entries in RREF

Tags: gaussian-elimination, linear-algebra, linear-transformations, matrices, vector-spaces

I have never intuitively understood the point of pivot entries in reduced row echelon form. For example, they can be used to extract a basis from a set of column vectors. Maybe you can point out the purpose of pivots with the help of the following example.

Say I have a matrix $A$ in reduced row echelon form, and consider the homogeneous system $Ax = \mathbf{0}$: $$
\begin{bmatrix}
1 & 1 & 0 & 7 & -2 \\
0 & 0 & 1 & -2& 2
\end{bmatrix}
\begin{bmatrix}
x_1\\
x_2\\
x_3\\
x_4\\
x_5
\end{bmatrix}
= \mathbf{0}
$$

This says the same as

$$x_1 + x_2 + 7x_4 -2x_5 = 0$$
$$x_3 -2x_4 + 2x_5 = 0$$
which says the same as
$$
\begin{equation}
x_1 = -x_2 - 7x_4 + 2x_5
\end{equation}\tag{1}
$$

$$
\begin{equation}
x_2 = -x_1 -7x_4 +2x_5
\end{equation}\tag{2}
$$

$$
\begin{equation}
x_3 = 2x_4 -2x_5
\end{equation}\tag{3}
$$

$$
\begin{equation}
x_4 = \frac{x_3}{2} + x_5
\end{equation}\tag{4}
$$

$$
\begin{equation}
x_5 = \frac{-x_3}{2} + x_4
\end{equation}\tag{5}
$$

Now, because columns 1 and 3 of $\operatorname{rref}(A)$ are pivot columns, you are supposed to re-express equations $(1)$ and $(3)$ in terms of the non-pivot (free) variables. I have no idea why, though:

$$
\begin{bmatrix}
x_1\\
x_2\\
x_3\\
x_4\\
x_5
\end{bmatrix} =
x_2
\begin{bmatrix}
-1\\
1\\
0\\
0\\
0
\end{bmatrix}
+
x_4
\begin{bmatrix}
-7\\
0\\
2\\
1\\
0
\end{bmatrix}
+
x_5
\begin{bmatrix}
2\\
0\\
-2\\
0\\
1
\end{bmatrix}
$$
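As a sanity check (a plain-Python sketch, not part of the derivation; `matvec` is just an ad-hoc helper), one can verify that each of these three vectors really lies in $\operatorname{N}(A)$:

```python
# Verify that the three proposed basis vectors are annihilated by A.
A = [
    [1, 1, 0, 7, -2],
    [0, 0, 1, -2, 2],
]

def matvec(M, v):
    """Matrix-vector product using plain Python lists."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

basis = [
    [-1, 1, 0, 0, 0],   # coefficient vector of x_2
    [-7, 0, 2, 1, 0],   # coefficient vector of x_4
    [2, 0, -2, 0, 1],   # coefficient vector of x_5
]

for v in basis:
    assert matvec(A, v) == [0, 0]  # each vector lies in N(A)
```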

Okay, but I don't see why you cannot pick any other two equations instead, say $(2)$ and $(3)$:

$$
\begin{bmatrix}
x_1\\
x_2\\
x_3\\
x_4\\
x_5
\end{bmatrix} =
x_1
\begin{bmatrix}
1\\
-1\\
0\\
0\\
0
\end{bmatrix}
+
x_4
\begin{bmatrix}
0\\
-7\\
2\\
1\\
0
\end{bmatrix}
+
x_5
\begin{bmatrix}
0\\
2\\
-2\\
0\\
1
\end{bmatrix}
$$
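The same plain-Python check (again just an illustrative sketch, with an ad-hoc `matvec` helper) confirms that this alternative choice is equally valid: each vector is annihilated by $A$, and linear independence is visible from the free coordinates, where the three vectors form an identity pattern.

```python
# Checking the alternative choice built from equations (2) and (3).
A = [
    [1, 1, 0, 7, -2],
    [0, 0, 1, -2, 2],
]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

alt_basis = [
    [1, -1, 0, 0, 0],    # coefficient vector of x_1
    [0, -7, 2, 1, 0],    # coefficient vector of x_4
    [0, 2, -2, 0, 1],    # coefficient vector of x_5
]

for v in alt_basis:
    assert matvec(A, v) == [0, 0]   # each vector lies in N(A)

# Restricted to coordinates (0, 3, 4) the vectors form the identity,
# so they are linearly independent.
pattern = [[v[i] for i in (0, 3, 4)] for v in alt_basis]
assert pattern == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```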

Both choices of vectors are linearly independent and thus form a basis for $\operatorname{N}(A)$.

So, what is so special about the pivot columns? The fact that a pivot is the only non-zero entry in its column means that re-expressing a pivot variable $x_i$ does not involve any other pivot variable $x_j$: the coefficient of $x_j$ in the expression for $x_i$ is $0$. I can see that. However, the same can be true for non-pivot columns, such as the second column of $\operatorname{rref}(A)$. So what exactly is the point of pivots?

Edit

Another example: if I choose equations $(2)$ and $(4)$, I get

$$
\begin{bmatrix}
x_1\\
x_2\\
x_3\\
x_4\\
x_5
\end{bmatrix} =
x_1
\begin{bmatrix}
1\\
-1\\
0\\
0\\
0
\end{bmatrix}
+
x_3
\begin{bmatrix}
0\\
0\\
1\\
\frac{1}{2}\\
0
\end{bmatrix}
+
x_4
\begin{bmatrix}
0\\
-7\\
0\\
0\\
0
\end{bmatrix}
+
x_5
\begin{bmatrix}
0\\
2\\
0\\
1\\
1
\end{bmatrix}
$$

But I must be missing something; this cannot be correct. I get $4$ linearly independent vectors as a basis of $\operatorname{N}(A)$, but all bases of a given vector space have the same cardinality. Maybe this error, whatever it is, exposes where I am having a misconception?
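The suspicion can be tested numerically (my own sketch, not part of the post; `matvec` is an ad-hoc helper): checking $Av = \mathbf{0}$ for each of the four vectors shows that three of them do not actually lie in $\operatorname{N}(A)$, because mixing equations $(2)$ and $(4)$ while treating $x_3$ and $x_4$ as both free violates the first equation of the system.

```python
# Test whether each of the four vectors from the edit lies in N(A).
from fractions import Fraction

A = [
    [1, 1, 0, 7, -2],
    [0, 0, 1, -2, 2],
]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

candidates = {
    "x1": [1, -1, 0, 0, 0],
    "x3": [0, 0, 1, Fraction(1, 2), 0],
    "x4": [0, -7, 0, 0, 0],
    "x5": [0, 2, 0, 1, 1],
}

in_null_space = {name: matvec(A, v) == [0, 0] for name, v in candidates.items()}
# Only the first vector survives; the other three fail the first equation,
# so the four vectors do not parametrize N(A) at all.
assert in_null_space == {"x1": True, "x3": False, "x4": False, "x5": False}
```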

Best Answer

Gaussian elimination to RREF is one particular way to turn a system of linear equations into a parametric form for its solution set.

The parametric form you get from the textbook method is not in any meaningful way "better" than any other form you could choose, such as your example. In the vast majority of applications each of them will be equally good.

The only real reason the textbook method ends up with this result, rather than another equivalent one, is that it is relatively easy to describe a procedure that leads to it deterministically, without any need to make choices or be clever along the way. There are several contexts where knowing that such a general method exists is valuable, even if you don't particularly care which of the many valid forms of the solution it settles on.

In particular, there is no real geometric significance to the columns the method ends up selecting as pivots; usually there will be many other possibilities. Gaussian elimination will, in some sense, find the leftmost possible variables to compute from the others, and you can sometimes use this property to get some control over which ones are picked: arrange your problem before you start calculating so that the variables you would most like to derive from the others sit farther to the left. In most cases you won't even care, though.
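This determinism is easy to see in code. The sketch below (my own illustrative implementation of Gauss-Jordan elimination, not from the answer) tracks which columns receive pivots; reordering the variables changes which of them the method solves for, exactly as described above.

```python
from fractions import Fraction

def rref(M):
    """Reduced row echelon form via Gauss-Jordan; returns (R, pivot_columns)."""
    R = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(cols):
        if r == rows:
            break
        # Find a row at or below r with a nonzero entry in column c.
        pivot_row = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if pivot_row is None:
            continue                              # c is a free column
        R[r], R[pivot_row] = R[pivot_row], R[r]   # swap it into place
        R[r] = [x / R[r][c] for x in R[r]]        # scale the pivot to 1
        for i in range(rows):                     # clear the rest of column c
            if i != r and R[i][c] != 0:
                factor = R[i][c]
                R[i] = [a - factor * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots

A = [[1, 1, 0, 7, -2],
     [0, 0, 1, -2, 2]]
_, piv = rref(A)
assert piv == [0, 2]      # the method picks x_1 and x_3 as pivot variables

# Put the column of x_2 first (variable order x_2, x_1, x_3, x_4, x_5):
A_swapped = [[row[1], row[0], row[2], row[3], row[4]] for row in A]
_, piv2 = rref(A_swapped)
assert piv2 == [0, 2]     # column 0 now holds x_2, so x_2 becomes a pivot
```

The procedure never makes a choice: it always grabs the leftmost usable column, which is why the result is the same no matter who runs it.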

(TL;DR: You're overthinking it.)
