[Math] Understanding the significance of row space and column space basis

linear-algebra, matrices, vector-spaces

I've just learned about the row and column space basis and I'm confused about what the significance of each is. My professor basically hasn't said much and has danced around any direct questions on how these things relate.

So what I know is this:

A basis for the column space is found by taking a spanning set of vectors and forming a matrix A, computing RREF(A), looking at the pivot columns (the linearly independent columns), and then relating that back to the original matrix A. The corresponding columns in the original matrix A form a basis for the column space of A.

This makes sense because you can see that any column of that matrix can be written as a linear combination of the basis vectors.
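For concreteness, here is roughly what I mean on a small made-up matrix (just an illustration; I'm using sympy to do the row reduction):

```python
from sympy import Matrix

# A small made-up matrix whose columns are the spanning set.
A = Matrix([[1, 2, 5],
            [2, 1, 4],
            [3, 0, 3]])

# rref() returns the RREF together with the indices of the pivot columns.
R, pivots = A.rref()

# Relate back to the original matrix: the columns of A in the pivot
# positions form a basis for the column space of A.
basis = [A.col(j) for j in pivots]
print(pivots)   # the pivot column indices, here (0, 1)
print(basis)    # the corresponding columns of the original A
```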

Now, when it comes to the row space, I can do it mechanically: transpose A, reduce it to echelon form (not RREF), transpose it again, and read off the columns. In this case, you do not go back to the original matrix.

My questions are: why don't we go back to the original matrix? Why do we only reduce it to echelon form and then read off the rows? How does this basis relate to the rows of the original matrix A?

Sorry if these are elementary questions. I feel like I can do it mechanically (and begrudgingly), but I really want to understand and appreciate it.

Thank you!

Best Answer

Your procedures are rather non-standard in terms of finding bases for columnspaces and rowspaces. In particular, your second procedure for finding the rowspace is not only unorthodox, but appears to be incorrect. Fully answering all of your questions takes a bit of time, and I apologize in advance for the length. The fact is, you can find both bases in a single shot by reducing the original matrix, say $A$, to its RREF, say $R$.

First of all, the RREF is fundamentally defined in terms of the rows; that's why it is the Reduced Row Echelon Form. Elementary row operations by design do not change the rowspace: each new row remains a linear combination of the old rows, and since every elementary row operation is reversible, each old row is in turn a linear combination of the new ones. That means you can row reduce your matrix all you want, but the rowspace stays the same. It follows that $R$ and $A$ share the same rowspace.

But the non-zero rows of $R$ are linearly independent: each one has an entry, its leading pivot entry, in a column where no other row has a non-zero entry. The non-zero rows of $R$ also clearly span the rowspace of $R$, just by definition, and therefore they form a basis for the rowspace of $R$. Now since $\mathrm{row}(R) = \mathrm{row}(A)$, it follows that the non-zero rows of $R$ also form a basis for the rowspace of $A$. There is no need to go back to the original matrix because the rowspace does not change under elementary row operations.
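Here is a small sanity check of that claim, a minimal sympy sketch with an arbitrary illustrative matrix: appending any row of $A$ to $R$ never increases the rank, so every row of $A$ already lies in $\mathrm{row}(R)$ (the reverse inclusion holds because the rows of $R$ were built out of the rows of $A$).

```python
from sympy import Matrix

# Arbitrary illustrative matrix.
A = Matrix([[1, 2, 5],
            [2, 1, 4],
            [3, 0, 3]])
R, _ = A.rref()

# If stacking a row of A onto R does not raise the rank, that row is
# already in row(R).  This holds for every row, so row(A) = row(R).
for i in range(A.rows):
    print(Matrix.vstack(R, A.row(i)).rank() == R.rank())   # True each time
```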

Finding a basis for the columnspace is a bit more complicated because elementary row operations do not preserve the columnspace: the columns of $R$ do not necessarily span the same space as the columns of $A$. Of course, the columnspace of $A$ is just the rowspace of $A^\mathrm{T}$, so we can always apply the previous procedure to $A^\mathrm{T}$. I believe this is what your second procedure tries to do, but it doesn't get it quite right; more on that later. In any case, reducing two matrices is hard work, and luckily we can actually kill two birds with one stone.

If we encode the sequence of elementary row operations which take $A$ to $R$ as elementary matrices, then we can write $$E_k\cdots E_1 A = R$$ where each $E_i$ is an elementary matrix. Let us just collectively write $$EA = R$$ where $E$ is the product of all our elementary matrices. The important thing here is that $E$ is an invertible matrix since each $E_i$ is invertible. This fact actually characterizes row equivalent matrices: two (same sized) matrices $A$ and $B$ are row equivalent if and only if there exists an invertible matrix $E$ such that $A=EB$.
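If you want to see $E$ explicitly, one standard trick is to row reduce the augmented matrix $[A \mid I]$: the left block of the result is $R$, and the right block is an invertible matrix $E$ with $EA = R$. A small sympy sketch, with an arbitrary example matrix:

```python
from sympy import Matrix, eye

# Arbitrary illustrative matrix.
A = Matrix([[1, 2, 5],
            [2, 1, 4],
            [3, 0, 3]])

# Row reduce [A | I].  The left block becomes R = RREF(A) and the right
# block is an invertible matrix E with E*A = R.
augmented = A.row_join(eye(A.rows))
full = augmented.rref()[0]
R = full[:, :A.cols]
E = full[:, A.cols:]

print(E * A == R)     # True
print(E.det() != 0)   # True: E is invertible
```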

Let us write $\mathbf{a}_i$ for the $i$th column of $A$ and $\mathbf{r}_i$ for the $i$th column of $R$. By block matrix multiplication, it follows that $$E\mathbf{a}_i = \mathbf{r}_i.$$ My claim now is that linear relations are preserved between the columns of $A$ and $R$, i.e. for any set of scalar coefficients $\{c_i\}$, we have $$\mathbf{0} = \sum_{i=1}^n c_i\mathbf{a}_i \iff \mathbf{0}=\sum_{i=1}^n c_i \mathbf{r}_i$$ The proof is simple: $$\mathbf{0} = \sum_{i=1}^n c_i\mathbf{r}_i \iff \mathbf{0}=\sum_{i=1}^n c_iE\mathbf{a}_i \iff \mathbf{0}= E\left(\sum_{i=1}^n c_i\mathbf{a}_i\right) \iff \mathbf{0} = \sum_{i=1}^n c_i\mathbf{a}_i$$ The final equivalence follows because $E$ is invertible and therefore its nullspace is just $\{\mathbf{0}\}$.
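As a quick concrete check of the claim (using the same illustrative matrix as before, whose third column happens to equal the first column plus twice the second):

```python
from sympy import Matrix, zeros

# Illustrative matrix: its columns satisfy a1 + 2*a2 - a3 = 0.
A = Matrix([[1, 2, 5],
            [2, 1, 4],
            [3, 0, 3]])
R, _ = A.rref()

c = Matrix([1, 2, -1])        # coefficients of the relation
print(A * c == zeros(3, 1))   # True: the relation holds among the columns of A
print(R * c == zeros(3, 1))   # True: the same relation holds among the columns of R
```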

The above relations are very powerful and we have some immediate corollaries:

Let $\mathcal{A}=\{\mathbf{a}_{i_k}\}$ be a subset of $\{\mathbf{a}_i\}$ and let $\mathcal{R} = \{\mathbf{r}_{i_k}\}$ be the corresponding subset of $\{\mathbf{r}_i\}$. Then

  1. $\mathcal{A}$ is linearly independent if and only if $\mathcal{R}$ is linearly independent.

  2. $\mathcal{A}$ spans $\mathrm{col}(A)$ if and only if $\mathcal{R}$ spans $\mathrm{col}(R)$.

  3. The two points above imply that $\mathcal{A}$ forms a basis for $\mathrm{col}(A)$ if and only if $\mathcal{R}$ forms a basis for $\mathrm{col}(R)$.

I will not prove the above propositions; the answer is getting a bit long, and I think they would make a good exercise. I encourage you to attempt a proof yourself. The implications of the corollaries are straightforward: if you are given a basis for the columnspace of $R$, taking the corresponding columns of $A$ gives a basis for the columnspace of $A$. The pivot columns of $R$ immediately give a basis for $\mathrm{col}(R)$, and so the corresponding columns of $A$ give a basis for $\mathrm{col}(A)$. This is the underlying theory behind your first procedure.

To reiterate, take care to note that the pivot columns of $R$ do not themselves form a basis for $\mathrm{col}(A)$; only the corresponding columns of $A$ do. Elementary row operations are designed to preserve the rowspace, not the columnspace. Oftentimes $\mathrm{col}(A) \neq \mathrm{col}(R)$, and it makes no sense to take columns of $R$ to act as a basis for the columnspace of $A$. This is the fundamental reason why one of the procedures needs to go back to the original matrix while the other one does not.
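A tiny example makes the distinction vivid; the matrix below is just a convenient choice:

```python
from sympy import Matrix

# Here col(A) and col(R) are genuinely different subspaces of R^2.
A = Matrix([[0, 0],
            [1, 1]])
R, pivots = A.rref()   # R = [[1, 1], [0, 0]], pivots = (0,)

print(A.col(pivots[0]))   # (0, 1)^T: a basis for col(A), the "y-axis"
print(R.col(pivots[0]))   # (1, 0)^T: spans col(R), the "x-axis", which is not col(A)
```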

This now enables you to efficiently find bases for the rowspace and the columnspace all in one go, and we've explained how your first procedure works. With the benefit of hindsight, you should be able to see why your second procedure is a bit odd. You're row reducing the transpose, which means that you are essentially performing elementary column operations on your original matrix. These operations do not preserve the rowspace, and so the procedure as you've described it cannot be correct. At best, you are just performing the procedure for finding the columnspace basis on $A^\mathrm{T}$, but that still requires you to go back to your original matrix. Even if your second procedure were correct, it turns out to be largely redundant, as we've shown that simply reducing $A$ to its RREF $R$ is sufficient to find a basis for the columnspace and the rowspace (and the nullspace if you want).
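To make the "one go" concrete, here is a short sympy sketch (the matrix is again just an illustration) reading all three bases off a single reduction:

```python
from sympy import Matrix

# One row reduction of the illustrative matrix A yields bases for the
# rowspace, the columnspace, and the nullspace.
A = Matrix([[1, 2, 5],
            [2, 1, 4],
            [3, 0, 3]])
R, pivots = A.rref()

row_basis  = [R.row(i) for i in range(R.rows) if any(R.row(i))]  # non-zero rows of R
col_basis  = [A.col(j) for j in pivots]                          # columns of A at the pivot positions
null_basis = A.nullspace()                                       # sympy also reads this off the RREF

print(row_basis)
print(col_basis)
print(null_basis)
```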