When is row equivalence equivalent to the equivalence of systems of equations

linear algebra, matrices, matrix equations, systems of equations

Theorem 3 (Chapter 1) in Hoffman and Kunze's Linear Algebra, 2nd edition (HK) says

Theorem 3: If $A$ and $B$ are row-equivalent $m \times n$ matrices, the homogeneous systems of linear equations $AX = 0$ and $BX = 0$ have exactly the same solutions.

The proof makes essential use of the fact that the systems in question are homogeneous, but I'm not sure I can quite see how. I am also led to wonder how this homogeneity assumption can be relaxed (perhaps I am jumping the gun and HK will discuss this). At any rate, I wonder about these two questions:

  1. Can someone explicate exactly how the homogeneous assumption is used? In the crucial step, Hoffman and Kunze write (after reducing the proof by induction to equivalence of systems differing by just one row operation)

No matter which of the three types the row operation is, each equation in the system $A_{j+1}X = 0$ will be a linear combination of the equations in the system $AX = 0$.

Now my confusion is that row operations only act on the matrix $A$ of coefficients, not the vector $Y$ of inhomogeneous scalars. Is the point that, in general, we would also have to do the row operations on $Y$ in order to obtain a system of equations which is a linear combination of the original equations, but that here since $Y = 0$ this action is trivial?

  2. Theorem 3 can perhaps be rewritten as "if two systems of equations are homogeneous, then row equivalence of their coefficient matrices implies that the systems are equivalent" (here we recall that two systems of equations are equivalent when each equation of one can be written as a linear combination of the equations of the other). My question here is (a) does the converse hold ("if two systems of equations are homogeneous, then their equivalence implies row equivalence of their coefficient matrices"), and (b) if (a) is true, is there a similar if and only if statement when we relax homogeneity (presumably we'd need to add a hypothesis about the $Y$ vector)?

If an answer could be really clear about all the possible cases at play here, I would greatly appreciate it, as I know there are some pathological cases involving inconsistent systems which sometimes confuse me.

Best Answer

I’ll first offer the most direct answer I can think of given the material the book has covered up to this stage, and then I’ll provide what I think is the more conceptual, and honestly clearer, statement.

Suppose first for simplicity that $A\to B$ by a single row operation (and say they are $m\times n$ matrices). Let’s say for concreteness that the row operation is of type (1), i.e. multiplication of a fixed row $i$ of $A$ by a non-zero scalar $c$. Then for any vector $X$, we have \begin{align} AX= \begin{pmatrix} \sum_{j=1}^na_{1j}x_j\\ \vdots\\ \sum_{j=1}^na_{ij}x_j\\ \vdots\\ \sum_{j=1}^na_{mj}x_j \end{pmatrix}, \quad\text{and}\quad BX= \begin{pmatrix} \sum_{j=1}^na_{1j}x_j\\ \vdots\\ c\cdot\sum_{j=1}^na_{ij}x_j\\ \vdots\\ \sum_{j=1}^na_{mj}x_j \end{pmatrix}. \end{align} So, $AX=0$ obviously implies $BX=0$. Conversely, $BX=0$ and $c\neq 0$ (so that we can divide the $i^{th}$ equation in $BX=0$ by $c$) imply $AX=0$. Similarly, for a type (3) operation of swapping two rows, obviously $AX=0$ if and only if $BX=0$. Likewise you can check the case of a type (2) operation of adding a non-zero multiple of one row to another. Therefore, if $A\to B$ by a single row operation, then $AX=0$ if and only if $BX=0$, i.e. the two systems have exactly the same solutions. Finally, for general row-equivalent matrices $A=A_0\to\dots\to A_f=B$, the solution set is unchanged at each stage, so $AX=0$ and $BX=0$ have the same solutions (use induction to be precise).
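
As a concrete sanity check of this step, here is a small sympy sketch (the $3\times 4$ matrix and the particular operations are hypothetical, chosen only for illustration) verifying that each of the three types of row operation leaves the solution set of $AX=0$ unchanged:

```python
import sympy as sp

# A hypothetical 3x4 coefficient matrix, chosen only for illustration.
A = sp.Matrix([[1, 2, 0, 3],
               [0, 1, 1, 1],
               [2, 5, 1, 7]])

# Type (1): multiply row 1 by the non-zero scalar 5.
B1 = A.copy()
B1[1, :] = 5 * B1[1, :]

# Type (2): add 3 times row 0 to row 2.
B2 = A.copy()
B2[2, :] = B2[2, :] + 3 * B2[0, :]

# Type (3): interchange rows 0 and 2.
B3 = A.copy()
B3[0, :], B3[2, :] = A[2, :], A[0, :]

# The solutions of MX = 0 form the null space of M.  Row-equivalent matrices
# have the same reduced row echelon form, so sympy returns the same null-space
# basis for each, and a direct comparison of the bases is enough here.
for B in (B1, B2, B3):
    assert A.nullspace() == B.nullspace()

print("all three row operations preserve the solutions of AX = 0")
```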

So, yes, row operations act only on the coefficient matrix, but once you write out the equation $AX=0$ and compare it with the equation $BX=0$, you see that a vector $X$ satisfies one if and only if it satisfies the other. Your reading is exactly right: for an inhomogeneous system $AX=Y$ one would have to apply the same row operations to $Y$ as well (i.e. work with the augmented matrix $[A\mid Y]$) in order to preserve the solutions, and when $Y=0$ every row operation leaves $Y$ unchanged, which is why acting on $A$ alone suffices. The $0$ on the RHS is very important, because as a very trivial example, consider $A,B$ to be $1\times 1$ matrices, i.e. just numbers. Suppose $A=1$, and perform the row operation of multiplication by a non-zero scalar $c$, so $B=c$. The solution of $AX=1$ is simply $X=1$, while the solution of $BX=1$ is $X=\frac{1}{c}$. So if you change the coefficients but leave the RHS alone, you no longer get the same solutions.
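
To illustrate that last point, here is a small sympy sketch (the $2\times 2$ system is hypothetical, chosen only for illustration) showing that a row operation on $A$ alone changes the solutions of $AX=Y$, while the same operation applied to the augmented matrix $[A\mid Y]$ preserves them:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')

# Hypothetical 2x2 inhomogeneous system AX = Y, chosen only for illustration.
A = sp.Matrix([[1, 2], [0, 1]])
Y = sp.Matrix([3, 1])

B = A.copy()
B[0, :] = 5 * B[0, :]                 # row operation on the coefficients only

print(sp.linsolve((A, Y), x1, x2))    # {(1, 1)}
print(sp.linsolve((B, Y), x1, x2))    # {(-7/5, 1)} -- Y left untouched, solutions change

Y_scaled = Y.copy()
Y_scaled[0] = 5 * Y_scaled[0]         # apply the same operation to Y, i.e. to [A | Y]
print(sp.linsolve((B, Y_scaled), x1, x2))   # {(1, 1)} -- original solutions restored
```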


I haven’t read Hoffman and Kunze, so I’m not sure if they explain it in the following way later on. But it is a theorem that $A,B$ are row-equivalent if and only if there is an invertible matrix $R$ such that $A=RB$; this is the underlying reason why $A,B$ have the same null space.
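
Here is a minimal sympy sketch of one direction of that fact: applying the same sequence of row operations to $B$ and to the identity matrix produces a matrix $R$ with $A=RB$, because each row operation is left multiplication by an elementary matrix. The matrix $B$ and the particular operations are hypothetical, chosen only for illustration.

```python
import sympy as sp

# A hypothetical matrix B and an arbitrary sequence of row operations.
B = sp.Matrix([[1, 2, 0],
               [0, 1, 1],
               [2, 5, 1]])

A = B.copy()       # will be carried to a row-equivalent matrix A
R = sp.eye(3)      # accumulates the same operations, starting from the identity

def scale(M, i, c):  M[i, :] = c * M[i, :]                 # type (1)
def add(M, i, j, c): M[i, :] = M[i, :] + c * M[j, :]       # type (2)
def swap(M, i, j):   M[i, :], M[j, :] = M[j, :], M[i, :]   # type (3)

for op in (lambda M: scale(M, 0, 3),
           lambda M: add(M, 2, 1, -5),
           lambda M: swap(M, 0, 1)):
    op(A)
    op(R)

assert R.det() != 0                      # R is invertible (product of elementary matrices)
assert A == R * B                        # and it realizes the row equivalence A = RB
assert A.nullspace() == B.nullspace()    # hence the same null space, as claimed
print(R)
```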

Let’s speak more generally in terms of linear maps between vector spaces, because I find that matrices are just unnecessarily specific (and this is especially true for this forward implication). Consider two linear maps $\alpha,\beta:V\to W$, and suppose there is an isomorphism $\rho:W\to W$ such that $\alpha=\rho\circ\beta$. Then, by definition, for any $x\in V$, we have $\alpha(x)=0$ if and only if $\rho(\beta(x))=0$. Since $\rho$ is linear, $\rho(0)=0$, and since $\rho$ is an isomorphism, we have $\rho(\beta(x))=0$ if and only if $\beta(x)=0$. Therefore, $\alpha(x)=0$ if and only if $\beta(x)=0$. In other words, $\alpha,\beta$ have the same kernel/nullspace.


Conversely, suppose $\alpha,\beta:V\to W$ have the same null space $N\subset V$. The first isomorphism theorem for linear maps (which is one of the most fundamental results, leading to things like rank-nullity and so on) tells us that $\alpha$ and $\beta$ descend to canonical isomorphisms $\tilde{\alpha}:V/N\to \text{image}(\alpha)$ and $\tilde{\beta}:V/N\to \text{image}(\beta)$ such that $\alpha=\tilde{\alpha}\circ\pi$ and $\beta=\tilde{\beta}\circ\pi$, where $\pi:V\to V/N$ is the canonical projection to the quotient. These equations can be rearranged into \begin{align} \alpha&=\tilde{\alpha}\circ\pi=\tilde{\alpha}\circ\left[(\tilde{\beta})^{-1}\circ\beta\right]=\rho_0\circ\beta, \end{align} where $\rho_0=\tilde{\alpha}\circ\tilde{\beta}^{-1}:\text{image}(\beta)\to\text{image}(\alpha)$ is an isomorphism. This is already a good partial converse.

If we wish to go further and extend $\rho_0$ to an isomorphism $\rho:W\to W$, then let us make the further assumption that $W$ is finite-dimensional. Then, we can fix complementary subspaces $K_{\alpha},K_{\beta}$ for $\text{image}(\alpha),\text{image}(\beta)$ in $W$ respectively, i.e. \begin{align} W=\text{image}(\alpha)\oplus K_{\alpha},\quad\text{and}\quad W=\text{image}(\beta)\oplus K_{\beta}. \end{align} Then, by finite-dimensionality of $W$, and the fact that the images of $\alpha,\beta$ have the same dimension (since they are isomorphic), it follows by subtracting dimensions (which is legitimate since they are finite) that \begin{align} \dim K_{\alpha}=\dim W-\dim\text{image}(\alpha)=\dim W-\dim\text{image}(\beta)=\dim K_{\beta}. \end{align} Thus, there is an isomorphism $\rho_1:K_{\beta}\to K_{\alpha}$. Putting $\rho_0$ and $\rho_1$ together (using the decomposition $W=\text{image}(\beta)\oplus K_{\beta}$), we get an isomorphism $\rho:W\to W$ which extends $\rho_0$. Therefore, $\alpha=\rho\circ\beta$, which proves our claim.

Maybe the first isomorphism theorem and complementary subspaces are too abstract, so here’s a more down-to-earth explanation of this converse using bases. Assume $W$ is finite-dimensional as above, and for further simplicity assume $V$ is also finite-dimensional (we don’t really need this though… at least if you’re like me and happy to assume every vector space has a basis). Suppose that $\alpha,\beta$ both have the same null space $N\subset V$; we wish to show that there is an isomorphism $\rho:W\to W$ such that $\alpha=\rho\circ\beta$.

Fix a basis $\{v_1,\dots, v_n\}$ for $N$, and extend it to a basis $\{v_1,\dots, v_n; u_1,\dots, u_r\}$ for $V$ ($n$ for nullity, $r$ for rank :). Then, $\alpha,\beta$ send the first $n$ vectors to $0$, while the remaining $r$ vectors are mapped to bases of their images, i.e. $\{\alpha(u_1),\dots, \alpha(u_r)\}$ and $\{\beta(u_1),\dots,\beta(u_r)\}$ form bases for $\text{image}(\alpha)$ and $\text{image}(\beta)$ respectively. Extend these to bases for $W$ as $\{\alpha(u_1),\dots, \alpha(u_r),a_1,\dots, a_m\}$ and $\{\beta(u_1),\dots,\beta(u_r),b_1,\dots, b_m\}$. Finally, let $\rho:W\to W$ be the unique linear map which sends, for each $1\leq i\leq r$ and $1\leq j\leq m$, the vectors $\beta(u_i)$ to $\alpha(u_i)$ and $b_j$ to $a_j$. Then, since $\rho$ sends a basis to a basis, it is an isomorphism, and from the definitions we have $\alpha=\rho\circ\beta$ (both linear maps send the vectors $v_1,\dots, v_n$ to $0$, and by definition $(\rho\circ\beta)(u_i)=\rho(\beta(u_i))=\alpha(u_i)$, so $\alpha$ and $\rho\circ\beta$ agree on the basis $\{v_1,\dots, v_n;u_1,\dots, u_r\}$ of $V$, and thus are equal on all of $V$).
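
If it helps to see this construction in coordinates, here is a sympy sketch that carries it out for two hypothetical $3\times 3$ matrices standing in for $\alpha$ and $\beta$ (chosen only because they have the same null space): it extends a null-space basis to a basis of $V$, pushes it forward, extends again in $W$, builds the matrix of $\rho$, and checks that $A=RB$ with $R$ invertible.

```python
import sympy as sp

# Two hypothetical 3x3 matrices with the same null space, for illustration only.
A = sp.Matrix([[1, 2, 0], [0, 0, 1], [1, 2, 1]])
B = sp.Matrix([[2, 4, 1], [0, 0, 3], [0, 0, 0]])
assert A.nullspace() == B.nullspace()           # same N (same rref, same basis)

def extend_to_basis(vectors, dim):
    """Extend an independent list of column vectors to a basis of Q^dim by
    appending standard basis vectors and keeping the pivot columns."""
    M = sp.Matrix.hstack(*vectors, sp.eye(dim))
    return M.columnspace()                      # the original vectors come first

null_basis = A.nullspace()                      # v_1, ..., v_n
V_basis = extend_to_basis(null_basis, A.cols)   # v_1,...,v_n, u_1,...,u_r
us = V_basis[len(null_basis):]                  # the u_i

alpha_u = [A * u for u in us]                   # a basis of image(alpha)
beta_u  = [B * u for u in us]                   # a basis of image(beta)

a_extra = extend_to_basis(alpha_u, A.rows)[len(alpha_u):]   # a_1, ..., a_m
b_extra = extend_to_basis(beta_u,  B.rows)[len(beta_u):]    # b_1, ..., b_m

# rho is the linear map sending beta(u_i) |-> alpha(u_i) and b_j |-> a_j,
# so its matrix is [alpha(u) | a] * [beta(u) | b]^(-1).
P = sp.Matrix.hstack(*alpha_u, *a_extra)
Q = sp.Matrix.hstack(*beta_u,  *b_extra)
R = P * Q.inv()

assert R.det() != 0 and A == R * B              # alpha = rho o beta, with rho invertible
print(R)
```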


So, to summarize, yes, we do have an if and only if statement, but I didn’t prove the converse direction in matrix language, because I find it quite annoying; working more generally with linear transformations clarifies the setting much more. And regarding your final follow-up, I would say the ‘best’ (and also clearest) statement one can make is that the images of $\alpha,\beta$ are isomorphic (and in fact we have a canonical isomorphism $\rho_0$ as described above). Also, you may wish to think about the following set-theoretic property:

Let $\alpha,\beta:V\to W$ be functions. If there is an injective map $\rho:W\to W$ and a subset $E\subseteq W$ such that $\rho$ maps $E$ bijectively onto itself and $\alpha=\rho\circ \beta$, then the preimages $\alpha^{-1}(E)$ and $\beta^{-1}(E)$ are equal.

In this statement, we don’t need any linearity assumptions (the kernel argument above is essentially the special case $E=\{0\}$). Anyway, in linear algebra, one place this might come up is if $\rho$ is an isomorphism which maps a certain subspace isomorphically onto itself (for example, a rotation about the $z$-axis maps the $x$-$y$ plane bijectively onto itself).
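
For what it’s worth, here is a tiny finite-set sketch of that property (the functions and sets are hypothetical, and nothing about them is linear):

```python
# A purely set-theoretic check of the property above on small finite sets.
V = {0, 1, 2, 3}
E = {'a', 'b'}                                    # a subset of W = {'a','b','c','d'}

rho  = {'a': 'b', 'b': 'a', 'c': 'd', 'd': 'c'}   # injective; maps E bijectively onto E
beta = {0: 'a', 1: 'b', 2: 'c', 3: 'c'}           # an arbitrary function V -> W
alpha = {x: rho[beta[x]] for x in V}              # alpha = rho o beta, by construction

preimage = lambda f, S: {x for x in V if f[x] in S}
assert preimage(alpha, E) == preimage(beta, E)    # both are {0, 1}
print(preimage(alpha, E))
```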
