Understanding a Proof for “A Square Matrix is Invertible If its Rows Are Linearly Independent”

Tags: linear algebra, matrices

In Eric Lengyel's book, Mathematics for 3D Game Programming and Computer Graphics, 3rd Edition, there is a theorem that states

An $n \times n$ matrix M is invertible if and only if the rows of M form a linearly independent set of vectors.

Two proofs are provided, one for each direction of the "if and only if." I understand the first proof, but I don't fully understand the second: I follow parts of it, but not enough to grasp the whole argument. The proof, slightly modified for brevity, goes as follows:

Let the rows of M be denoted by $R_{1}^{T}, R_{2}^{T}, \ldots, R_{n}^{T}$. Now assume that the rows of M are a linearly independent set of vectors. We first observe that performing elementary row operations on a matrix does not alter the property of linear independence within the rows. Running through Algorithm 3.12, if step C fails, then rows j through n of the matrix at that point form a linearly dependent set since the number of columns for which the rows $R_{j}^{T}$ through $R_{n}^{T}$ have at least one nonzero entry is less than the number of rows itself. This is a contradiction, so step C of the algorithm cannot fail, and M must be invertible.

Note: A screenshot from the book of Algorithm 3.12 is available at the bottom.
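As a quick numerical sanity check of the proof's first observation (that elementary row operations do not alter linear independence of the rows), here is a small numpy experiment; the specific matrix and the operations applied are my own illustration, not taken from the book:

```python
import numpy as np

# Elementary row operations do not change the row rank, so they preserve
# linear independence of the rows.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 3.0],
              [1.0, 0.0, 1.0]])          # rows are linearly independent (rank 3)

B = A.copy()
B[[0, 2]] = B[[2, 0]]                    # swap rows 1 and 3
B[1] *= 5.0                              # scale row 2 by a nonzero constant
B[2] += 4.0 * B[0]                       # add a multiple of row 1 to row 3

print(np.linalg.matrix_rank(A))          # 3
print(np.linalg.matrix_rank(B))          # still 3: independence is preserved
```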

What I don't fully understand is this snippet of the proof:

[…] rows j through n of the matrix at that point form a linearly dependent set since the number of columns for which the rows $R_{j}^{T}$ through $R_{n}^{T}$ have at least one nonzero entry is less than the number of rows itself. […]

Why does the fact that "the number of columns for which the rows $R_{j}^{T}$ through $R_{n}^{T}$ have at least one nonzero entry is less than the number of rows itself" imply that these rows form a linearly dependent set?


Appendix

Algorithm 3.12

[Screenshot from the book: Algorithm 3.12 and its accompanying description]
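For readers who don't have the book at hand: judging from the proof's reference to a "step C" that searches for the pivot of largest absolute value and can fail, Algorithm 3.12 is a Gauss–Jordan style inversion with pivoting. The sketch below is only my reconstruction of that kind of algorithm under that assumption; the function name, step labels, and details are not taken from the book.

```python
import numpy as np

def invert(M, eps=1e-12):
    """Gauss-Jordan inversion with partial pivoting.

    A sketch of the *kind* of algorithm the proof refers to; the actual
    Algorithm 3.12 in the book may differ in details and step labels.
    """
    M = np.array(M, dtype=float)
    n = M.shape[0]
    A = np.hstack([M, np.eye(n)])        # augment M with the identity

    for j in range(n):
        # "Step C": among rows j..n, find the entry in column j with the
        # largest absolute value; if all of them are (numerically) zero, fail.
        p = j + int(np.argmax(np.abs(A[j:, j])))
        if abs(A[p, j]) < eps:
            raise ValueError(f"step C failed: no nonzero pivot in column {j}")

        A[[j, p]] = A[[p, j]]            # swap the pivot row into place
        A[j] /= A[j, j]                  # scale the pivot row so the pivot is 1
        for i in range(n):               # eliminate column j from every other row
            if i != j:
                A[i] -= A[i, j] * A[j]

    return A[:, n:]                      # the right half is now the inverse of M
```

For example, `invert([[2, 1], [1, 1]])` returns `[[1, -1], [-1, 2]]`, while `invert([[1, 2], [2, 4]])`, whose rows are linearly dependent, raises the "step C failed" error.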

Best Answer

The proof appears to be using the fact that linearly independent subsets of $\Bbb{R}^n$ have at most $n$ vectors in them. For the sake of illustration, let's say we're at step C with $j = 3$ in an $n \times n$ matrix. Then our matrix looks something like this:

$$\begin{pmatrix} 1 & 0 & * & * & \cdots & * \\ 0 & 1 & * & * & \cdots & * \\ 0 & 0 & \color{red}* & * & \cdots & * \\ 0 & 0 & \color{red}* & * & \cdots & * \\ \vdots & \vdots & \color{red}\vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \color{red}* & * & \cdots & * \end{pmatrix}.$$

At step C, the algorithm looks for the entry of largest absolute value among the red $\color{red}*$s and fails if every one of them is $0$. But if every one of the $\color{red}*$s were $0$, then each of the last $n - 2$ rows would begin with three $0$s, and those $n - 2$ row vectors could be embedded in $\Bbb{R}^{n - 3}$ (by ignoring the first three $0$s). A linearly independent subset of $\Bbb{R}^{n - 3}$ contains at most $n - 3$ vectors, so there is one vector too many for these rows to be linearly independent. That contradicts the assumption that the rows of M are linearly independent, which is exactly the contradiction the proof needs (for the $j = 3$ case anyway; hopefully you can see how this would generalise).
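To make the dimension count concrete, here is a small numerical illustration of the $j = 3$ case (my own construction, not part of the answer above): force the "every red $\color{red}*$ is $0$" situation in a $5 \times 5$ matrix and check the rank of rows $3$ through $n$.

```python
import numpy as np

rng = np.random.default_rng(0)

n, j = 5, 3                              # n x n matrix, stuck at step C with j = 3
A = rng.standard_normal((n, n))
A[j - 1:, :j] = 0.0                      # rows 3..n are zero in columns 1..3
                                         # (the "every red * is 0" situation)

tail = A[j - 1:, :]                      # rows 3 through n: n - 2 = 3 vectors
print(np.linalg.matrix_rank(tail))       # 2 here; at most n - 3 = 2, never 3
```

Because their only possibly nonzero entries live in the last $n - 3 = 2$ columns, these $n - 2 = 3$ rows sit inside a $2$-dimensional subspace and therefore cannot be linearly independent, no matter what the remaining entries are.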