Confusion about Theorem 11 Chapter 2 of Hoffman and Kunze

linear-algebra, matrices, vector-spaces

This question has been asked before, but I'm afraid I didn't follow the answer (as I note in (1a) below, I'm not even sure the answer is complete), and neither the OP nor the answerer is active. I therefore ask again, with some extra details about my confusion. Note that HK = Hoffman and Kunze.

The theorem statement:

Theorem 11. Let $m$ and $n$ be positive integers and let $\mathbb{F}$ be a field.
Suppose $W$ is a subspace of $\mathbb{F}^n$ and $\dim(W)\leq m$. Then there is precisely one
$m\times n$ row-reduced echelon matrix over $\mathbb{F}$ which has $W$ as its row space.

Before giving the proof, I'll emphasize that I understand this is an existence and uniqueness claim: given some subspace $W$ of $\mathbb{F}^n$, (1) there exists a (2) unique $m \times n$ matrix in row-reduced echelon form (RREF) whose row space is $W$. (Indeed, as the construction will show, the matrices for different $m$ are trivially related by adding or removing zero rows; a small computational illustration follows.)
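
As a quick sanity check of that last remark (my own addition, not part of HK), here is a small sympy sketch: two different spanning sets of the same $W$ row-reduce to the same non-zero rows, and an extra spanning vector just becomes a zero row:

```python
from sympy import Matrix

# Two different matrices whose rows span the same subspace W of Q^4.
A = Matrix([[1, 2, 0, 3],
            [0, 0, 1, 4]])
B = Matrix([[1, 2, 1, 7],     # row 1 + row 2 of A
            [2, 4, 1, 10],    # 2*row 1 + row 2 of A
            [1, 2, 0, 3]])    # row 1 of A

# rref() returns (RREF matrix, tuple of pivot column indices).
RA, pivots_A = A.rref()
RB, pivots_B = B.rref()

print(RA)                  # Matrix([[1, 2, 0, 3], [0, 0, 1, 4]])
print(RB)                  # the same two rows, plus one zero row
print(pivots_A, pivots_B)  # (0, 2) (0, 2)
```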

Now for the proof, which comes in what feels to me like a convoluted order. I label the various "stanzas" with letters so as to refer to them later.

Proof.

Existence: proof omitted, as I follow this part. I will write $R'$ for the matrix with the requisite properties that is constructively shown to exist in this step [see (1a) below].

Uniqueness: (a) Now let $R$ be any row-reduced echelon matrix which has $W$ as its row
space. Let $\rho_1,\dots,\rho_r$ be the non-zero row vectors of $R$ and suppose that
the leading non-zero entry of $\rho_i$ occurs in column $k_i$, $i=1,\dots,r$. The
vectors $\rho_1,\dots,\rho_r$ form a basis for $W$ (Theorem 10). In the proof of Theorem 10, we
observed that if $\beta=(b_1, \dots , b_n)$ is in $W$, then the unique expression of $\beta$ as a linear combination of $\rho_1,\dots,\rho_r$ is
$$\beta=\sum_{i=1}^{r}b_{k_i}\rho_i.$$
Thus any vector $\beta$ is determined if one knows the coordinates $b_{k_i}$, $i=1,\dots,r$.
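
To make stanza (a) concrete, here is a small running example of my own (not HK's). Take $n=4$ and let $R$ have non-zero rows $\rho_1=(1,2,0,3)$ and $\rho_2=(0,0,1,4)$, so that $k_1=1$ and $k_2=3$. For
$$\beta = 2\rho_1 + 5\rho_2 = (2,4,5,26)$$
we indeed have $b_{k_1}=b_1=2$ and $b_{k_2}=b_3=5$: reading off the coordinates of $\beta$ in the pivot columns recovers its expansion coefficients.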

(b) Suppose $\beta\in W$ and $\beta\ne0$. We claim the first non-zero coordinate
of $\beta$ occurs in one of the columns $k_s$. Since
$$\beta=\sum_{i=1}^{r}b_{k_i}\rho_i$$
and $\beta\ne0$, we can write
$$\beta=\sum_{i=s}^{r}b_{k_i}\rho_i\;,\;b_{k_s}\ne0.$$
From the fact that $R$ is a row-reduced echelon matrix one has $R_{ij}=0$ if $i>s$ and $j\leq k_s$. Thus
$$\beta=(0,\dots,0,b_{k_s},\dots,b_n)\;,\;b_{k_s}\ne0$$
and the first non-zero coordinate
of $\beta$ occurs in one of the columns $k_s$.
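
In the running example above: a general element of $W$ is $a\rho_1+c\rho_2=(a,2a,c,3a+4c)$. If $a\ne0$, its first non-zero coordinate is in column $k_1=1$; if $a=0$ and $c\ne0$, it is $(0,0,c,4c)$, with first non-zero coordinate in column $k_2=3$. No non-zero vector of $W$ has its first non-zero coordinate in column 2 or column 4, exactly as stanza (b) predicts.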

(c) It is now clear that $R$ is uniquely determined by $W$. The description
of $R$ in terms of $W$ is as follows. We consider all vectors $\beta = (b_1, \dots , b_n)$
in $W$. If $\beta\ne0$, then the first non-zero coordinate of $\beta$ must occur in some
column $t$:
$$\beta=(0,\dots,0,b_t,\dots,b_n)\;,\;b_t\ne0.$$
Let $k_1, \dots , k_r$ be those positive integers $t$ such that there is some $\beta\ne0$
in $W$, the first non-zero coordinate of which occurs in column $t$. Arrange
$k_1, \dots , k_r$ in the order $k_1 < k_2 < \dots < k_r$. For each of the positive
integers $k_s$ there will be one and only one vector $\rho_s$ in $W$ such that the
$k_s$th coordinate of $\rho_s$ is 1 and the $k_i$th coordinate of $\rho_s$ is 0 for $i\ne s$. Then $R$ is the $m\times n$ matrix which has row vectors $\rho_1, \dots , \rho_r, 0, \dots , 0$.
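
Running the stanza (c) recipe on the example $W=\operatorname{span}\{(1,2,0,3),(0,0,1,4)\}$ (again my example, not HK's): by the computation after stanza (b), the possible positions of a first non-zero coordinate are $k_1=1$ and $k_2=3$. Demanding coordinate 1 in column $k_1$ and coordinate 0 in column $k_2$ of the general element $(a,2a,c,3a+4c)$ forces $a=1$ and $c=0$, i.e. $\rho_1=(1,2,0,3)$; similarly $\rho_2=(0,0,1,4)$. The recipe reproduces exactly the rows we started from.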

Now it's clear that stanza (c) is where the actual uniqueness claim is proved, and that stanzas (a) and (b) are lemmas used in (c). But I can't quite follow how. In particular,

(1a) "For each of the positive
integers $k_s$ there will be one and only one vector $\rho_s$ in $W$ such that the
$k_s$th coordinate of $\rho_s$ is 1 and the $k_i$th coordinate of $\rho_s$ is 0." All we have by construction of the $k_i$ is that there is at least one vector in $W$ with a nonzero entry in $k_s$, but that doesn't say that it's zero in the other columns $k_i$. So where does the "there will be one" come from? The answer linked seems to suggest that the existence is guaranteed by the existence of $R'$ with the required properties. But why should the columns which are nonzero in the $R'$ be related a priori to the $k_i$ which we have constructed here?

(1b) Further, why should this vector be unique? I think (but am not sure, so can someone confirm?) that uniqueness is guaranteed by stanza (a): fixing the coordinates $b_{k_i}$ (here we have chosen $b_{k_i}=\delta_{is}$) determines the expansion coefficients of the vector in terms of the basis given by the rows of whatever $R$ (with row space $W$) we are considering in this uniqueness proof, and basis expansions are unique.

(2) Accepting that (1a) and (1b) have been resolved, I still cannot see why "Then $R$ is the $m\times n$ matrix which has row vectors $\rho_1, \dots , \rho_r, 0, \dots , 0$." Since stanza (b) hasn't been used yet, this step presumably must follow from it, but I can't see how it does the trick for us.

Best Answer

Here is my attempt to explain why, given the existence of some matrix $R'$ in RREF with row space $W$, any matrix in RREF with row space $W$ is exactly $R'$. I will use the same procedure as in stanza (c), but I'll try to be more verbose about why it works. Otherwise it is essentially the same (though reordered) argument as in HK, except that I directly use the existence of $R'$, which will hopefully clarify things:

Given that $R'$ exists, say it has $r'$ non-zero row vectors $\rho_1,\dots,\rho_{r'}$, which form a basis for $W$ (by Theorem 10). Let $k'_i$ be the index of the first non-zero coordinate of $\rho_i$. By definition of RREF, $k'_i<k'_j$ for $i<j$, the $k'_i$th coordinate of $\rho_i$ is 1, and the $k'_j$th coordinate of $\rho_i$ is 0 for $j\neq i$. Now consider the procedure for computing the $k_i$ given in stanza (c). Let $K=\{k_1,\dots,k_{r}\}$, i.e. $K$ is the set of integers $t$ such that there exists a non-zero vector in $W$ whose first non-zero entry is its $t$th entry (with the $k_i$ in increasing order), and for each $k_i$ fix one such vector $v_i$. Since each $\rho_i$ is in $W$, each $k'_i$ is in $K$, so $r\geq r'$. We also have $r\leq r'$: the vectors $v_1,\dots,v_r$ are linearly independent (in any non-trivial linear combination of them, the coordinate in column $k_j$ for the smallest $j$ with a non-zero coefficient is non-zero, so the combination cannot vanish), while $\dim W=r'$ by virtue of $\rho_1,\dots,\rho_{r'}$ forming a basis. Thus $r=r'$, and since both lists are arranged in increasing order, $k_i=k'_i$. (This paragraph is analogous to stanza (b).)
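
In the running example from the question, $W=\operatorname{span}\{(1,2,0,3),(0,0,1,4)\}$ gives $K=\{1,3\}$, matching the leading columns $k'_1=1$, $k'_2=3$ of $R'$, with $r=r'=2$.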

Now suppose $w_i\in W$ is a vector which has $k_i$th entry 1 and $k_j$th entry 0 for $j\neq i$. Along the same lines as stanza (a), we have $$w_i=\sum_{j=1}^r c_j\rho_j,$$ where $c_j$ equals the $k_j$th entry of $w_i$. But we know those entries of $w_i$: they give $c_i=1$ and $c_j=0$ for $j\neq i$. Thus $w_i=\rho_i$.
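
In the running example: $w_1$ must have first coordinate 1 and third coordinate 0, so $c_1=1$ and $c_2=0$, leaving no freedom at all; necessarily $w_1=\rho_1=(1,2,0,3)$.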

Let $R$ be a matrix in RREF with row space $W$. By definition of RREF, each of its non-zero row vectors is a string of 0s followed by a leading 1 (and then arbitrary entries, except in the pivot columns). By the previous work, this leading 1 must occur in some column $k_i$. Moreover, the non-zero rows of $R$ are linearly independent (Theorem 10 again) and span $W$, so there are exactly $r$ of them, and by definition of RREF they are ordered by the position of their leading entries. Thus the $i$th row vector of $R$ has a 1 in column $k_i$ and zeroes in the columns before $k_i$. Again by definition of RREF, the $i$th row vector is 0 in column $k_j$ for $j\neq i$. The $i$th row of $R$ therefore satisfies exactly the properties of $w_i$ above, which we have shown forces it to equal $\rho_i$. Thus the row vectors of $R$ are the same as those of $R'$, and so $R=R'$, proving uniqueness.
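
A computational sketch of the full statement (my own addition; the function name and the padding to $m$ rows are just for illustration): two different bases of the same $W$ yield the same $m\times n$ RREF matrix.

```python
from sympy import Matrix, zeros

def rref_for_rowspace(M, m):
    """Return the m x n RREF matrix whose row space equals the
    row space of M, padding with zero rows up to m rows."""
    R, _ = M.rref()
    # keep only the non-zero rows of the RREF
    rows = [R.row(i) for i in range(R.rows) if any(R.row(i))]
    out = zeros(m, M.cols)
    for i, row in enumerate(rows):
        out[i, :] = row
    return out

# The same W presented by two different bases:
A = Matrix([[1, 2, 0, 3], [0, 0, 1, 4]])
B = Matrix([[3, 6, 2, 17], [1, 2, 1, 7]])  # 3*rho1 + 2*rho2, rho1 + rho2

print(rref_for_rowspace(A, 3) == rref_for_rowspace(B, 3))  # True
```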
