Personally, I don't feel that Gaussian elimination is the best way to view linear transformations. It is a standard theorem that every linear mapping can be represented as a matrix with respect to a chosen basis. For simplicity, I will stick with the standard basis.
Suppose we have a linear mapping $T:\ \mathbb{F}^n \rightarrow \mathbb{F}^n$. The action of the linear mapping is entirely determined by its action on the basis vectors. If we let
$$\mathbf{v} = c_1\mathbf{e_1} + \cdots + c_n\mathbf{e_n}$$
then correspondingly
$$T(\mathbf{v}) = c_1T(\mathbf{e_1}) + \cdots + c_nT(\mathbf{e_n})$$
so that knowing the set $\left\{T(\mathbf{e_1}),\ \cdots,\ T(\mathbf{e_n})\right\}$ is enough to determine the nature of the mapping.
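For a concrete toy example of my own (not taken from any particular text): suppose $T:\ \mathbb{R}^2 \rightarrow \mathbb{R}^2$ satisfies $T(\mathbf{e_1}) = (1,\ 2)^{\mathsf T}$ and $T(\mathbf{e_2}) = (0,\ 3)^{\mathsf T}$. Then for any $\mathbf{v} = c_1\mathbf{e_1} + c_2\mathbf{e_2}$,
$$T(\mathbf{v}) = c_1\begin{pmatrix}1\\2\end{pmatrix} + c_2\begin{pmatrix}0\\3\end{pmatrix} = \begin{pmatrix}c_1\\ 2c_1 + 3c_2\end{pmatrix},$$
so the two vectors $T(\mathbf{e_1})$ and $T(\mathbf{e_2})$ pin the mapping down completely.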
Conversely, this property shows how to determine the way a matrix acts as a mapping. If we have a matrix
$$A = \begin{pmatrix}\mathbf{a_1} & \cdots & \mathbf{a_n}\end{pmatrix}$$
where $\mathbf{a_i}$ are the column vectors of $A$, then the action of the mapping induced by multiplication by $A$ will be
$$A\mathbf{e_i} = \mathbf{a_i}$$
Multiplying $A$ by the $i$th standard basis vector has the effect of selecting the $i$th column of $A$. You can clearly see the action of the mapping in this way; the matrix maps the $i$th standard basis vector to its $i$th column.
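With the same toy example as above, the matrix of $T$ in the standard basis is
$$A = \begin{pmatrix}1 & 0\\ 2 & 3\end{pmatrix}, \qquad A\mathbf{e_2} = \begin{pmatrix}1 & 0\\ 2 & 3\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix} = \begin{pmatrix}0\\3\end{pmatrix} = \mathbf{a_2},$$
so multiplying by $\mathbf{e_2}$ really does read off the second column.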
These methods give a geometric interpretation to the matrices, but it is not necessarily the most natural one. Just because we know the action of the mapping doesn't mean that we necessarily understand it. This is why the concept of diagonalization (or more generally Jordanization as tomasz mentions) is introduced. Diagonalization is the process of selecting the most geometrically natural basis for the mapping.
If the basis $\left\{\mathbf{v_1},\ \cdots,\ \mathbf{v_n}\right\}$ diagonalizes $A$ to
$$D = \mathrm{diag}(\lambda_1,\ \cdots,\ \lambda_n)$$
then the action of the mapping is a dilation by a factor of $\lambda_i$ in the direction of the $i$th basis vector $\mathbf{v_i}$.
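A small illustration (again my own example, not from the question): the matrix
$$A = \begin{pmatrix}2 & 1\\ 1 & 2\end{pmatrix}$$
has eigenvectors $\mathbf{v_1} = (1,\ 1)^{\mathsf T}$ and $\mathbf{v_2} = (1,\ -1)^{\mathsf T}$ with eigenvalues $\lambda_1 = 3$ and $\lambda_2 = 1$, so in the basis $\left\{\mathbf{v_1},\ \mathbf{v_2}\right\}$ it becomes $D = \mathrm{diag}(3,\ 1)$: the map stretches by a factor of $3$ along $\mathbf{v_1}$ and leaves the $\mathbf{v_2}$ direction unchanged, which is much easier to visualize than the original entries of $A$.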
I agree that the theorem and its proof are poorly written.
Given a linear map $T\colon U\to V$ (where $U$ and $V$ are finite-dimensional vector spaces with dimensions $n$ and $m$ respectively), one can represent $T$ by a matrix $T_{\alpha}^{\beta}$ with respect to chosen bases $\alpha$ of $U$ and $\beta$ of $V$. Hence, after a choice of bases, one can talk about the matrix of a linear map. (Again, I stress that this matrix representation depends on a choice of bases!)
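As a quick illustration of this dependence (my own example): take $T$ to be the identity map on $\mathbb{R}^2$, let $\alpha$ be the standard basis, and let $\beta' = \{\mathbf{e_1},\ \mathbf{e_1}+\mathbf{e_2}\}$. Then
$$T_{\alpha}^{\alpha} = \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} \qquad \text{but} \qquad T_{\alpha}^{\beta'} = \begin{pmatrix}1 & -1\\ 0 & 1\end{pmatrix},$$
since $\mathbf{e_2} = -\mathbf{e_1} + (\mathbf{e_1}+\mathbf{e_2})$. Same map, different matrices.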
In the theorem above and the corollary, the author refers to both the linear map and a matrix representation of it as $T$, which forces a lot of confusion into the situation.
I think the following is all you need to take away from this:
Theorem: Let $T\colon U\to V$ be a linear map. Then $T$ is invertible if and only if the conditions of corollary 9.7 hold. Moreover, if $T$ is invertible then $T_{\alpha}^{\beta}$ is a non-singular matrix for all bases $\alpha$ and $\beta$.
Proof: Assume that $T$ is invertible. Then clearly $\ker(T)=\{0\}$ and $\text{im}(T)=V$, so by theorem 9.6 we find that $n=m$. Moreover, since $T$ is assumed to be invertible, the conditions of corollary 9.7 hold. Conversely, if we assume that $n=m$ and that $T$ is bijective (in other words, the conditions of corollary 9.7), then there is nothing to prove.
Now let $\alpha,\alpha'$ be bases of $U$ and let $\beta,\beta'$ be bases of $V$. Note that $$T_{\alpha}^{\beta}=Id_{\beta'}^{\beta}T_{\alpha'}^{\beta'}Id_{\alpha}^{\alpha'}$$
and that the matrices $Q=Id_{\beta'}^{\beta}$ and $P=Id_{\alpha}^{\alpha'}$ are non-singular. It follows that $T_{\alpha}^{\beta}$ is non-singular if and only if $T_{\alpha'}^{\beta'}$ is non-singular.$\square$
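As a tiny check of this relation (my own example): take $T$ the identity on $\mathbb{R}^2$, $\alpha = \alpha' = \beta$ the standard basis and $\beta' = \{\mathbf{e_1},\ \mathbf{e_1}+\mathbf{e_2}\}$. Then $Q = Id_{\beta'}^{\beta} = \begin{pmatrix}1 & 1\\ 0 & 1\end{pmatrix}$, $P = Id_{\alpha}^{\alpha'} = I$, and indeed
$$Q\,T_{\alpha'}^{\beta'}\,P = \begin{pmatrix}1 & 1\\ 0 & 1\end{pmatrix}\begin{pmatrix}1 & -1\\ 0 & 1\end{pmatrix} = \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} = T_{\alpha}^{\beta}.$$
The non-singularity claim also follows at a glance from determinants: $\det T_{\alpha}^{\beta} = \det Q \cdot \det T_{\alpha'}^{\beta'} \cdot \det P$ with $\det P,\ \det Q \neq 0$.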
Edit: Looking at the proof the author gave, it seems he interprets (v) of corollary 9.7 as saying that $T_{\alpha}^{\beta}$ is non-singular for some choice of bases. From that, he first wants to show that $T$ is invertible as a linear map.
Best Answer
A matrix/transformation is invertible if and only if its kernel is $\{\vec 0\}$. In other words, a matrix/transformation is invertible if and only if the only vector it sends to zero is the zero vector itself.
Now, assume that $AB$ is invertible, i.e. $ABv\ne0$ for every $v\ne0$:
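A natural way to finish this line of reasoning (assuming, as seems intended, that $A$ and $B$ are square): if $B$ were not invertible, there would be some $v\ne 0$ with $Bv=0$, and then
$$ABv = A(Bv) = A\vec 0 = \vec 0,$$
contradicting the assumption that $AB$ sends no nonzero vector to zero. Hence $\ker(B)=\{\vec 0\}$ and $B$ is invertible. Then $A = (AB)B^{-1}$ is a product of invertible matrices, so $A$ is invertible as well.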