[Math] Matrices equipped with the rank distance form a metric space

general-topologylinear algebrametric-spacesreal-analysis

Right now, I'm taking a basic introductory analysis course. I came across this on wikipedia and I'm Really struggling to prove it is a metric space. (Need help with the 3rd axiom of metric space the most and making the 2nd one more rigorous/less handwaving)

The set of all $m$ by $b$ matrices over some field is a metric space with respect to the rank distance $d(X,Y)=\operatorname{rank}(Y-X) $

Facts that I know:
So I know that in order to prove something is a metric space

  1. $d(X,Y)=0$ iff $X=Y$ (have to prove both ways)

  2. $d(X,Y)=d(Y,X)$ (Symmetry)

  3. $d(X,Y)\leqslant d(X,Z) +d(Z,Y)$ (Triangle inequality).

Also, it has been a long time since I have taken linear algebra", but I now that to find the rank, you reduce a matrix to its row echelon form and the number of nonzero rows is the rank (or I think number of linearly independent rows, again it has been awhile).

My attempt:

  1. If $Y=X$, We have $\operatorname{rank}(Y-X)=\operatorname{rank}(Y-Y)=\operatorname{rank}(O)$ (where $O$ is the null matrix) which is clearly equal to $0$. Thus $d(X,Y)=0$. Proving other direction if $d(X,Y)=0$, we have $\operatorname{rank}(Y-X)=0$. In this case $Y$ must necessarily equal $X$. If $Y$ does not equal $X$, rank must be at least $1$.

  2. Why must $d(X,Y)=d(Y,X)$? The matrix $X-Y$ and $Y-X$ only differ by signs. That doesn't change rank. Hence $\operatorname{rank}(Y-X)=\operatorname{rank}(X-Y)$. (feel like I hand waved this explanation a bit).

  3. Why must $d(X,Y)\leqslant d(X,Z)+ d(Z,Y)$? Need to show: $$\operatorname{rank}(Y-X)\leqslant\operatorname{rank}(Z-X)+ \operatorname{rank}(Y-Z)$$where $X,Y,Z$ are $m \times n$ matrices.

    This one I'm not sure at all how to approach this. Why must $\operatorname{rank}(Y-X)\leqslant \operatorname{rank}(Z-X)+\operatorname{rank}(Y-Z)$? I'm not sure why this must be the case and how to formally show this.

Best Answer

Your answers for the first and second points are correct. The second is not hand-wavy by any means, you have said the crucial thing, namely that $X-Y$ is a scalar multiple of $Y-X$, so their rank is the same.

As for the third one, if $Z - X = a$ and $Y - Z = b$, then you essentially have to prove that $\operatorname{rank}(a+b) \leq \operatorname{rank}(a) + \operatorname{rank}(b)$.

However, note that if some set of vectors span the columns of $a$, and another set of vectors span the columns of $b$, then the union of these vectors spans the columns of $a+b$. From here, I urge you to use the definition of rank to understand why this statement follows.

EDIT : Suppose that $\{e_i\}$ span the columns of $a$, and $\{f_j\}$ span the columns of $b$. My claim is that $\{e_i\} \cup \{f_j\}$ span the columns of $a+b$.

It's easy to see why. Suppose $A$ is a column of $a+b$, then we know that $A= A_a + A_b$, where $A_a$ and $A_b$ are the corresponding columns of $a$ and $b$ which were added to give the column $A$ of $a+b$.

Now, $A_a$ is spanned by the $\{e_i\}$, so there exist constants $c_i$ such that $A_a = \sum_{i} c_ie_i$. Simiarly, since $A_b$ is spanned by $\{f_j\}$, we get that $A_b = \sum_j d_jf_j$ for some constants $d_j$.

Adding these two, gives that $A = \sum_i c_ie_i + \sum_j {d_jf_j}$. Hence, you can see that $A$ is spanned by $\{e_i\} \cup \{f_j\}$. Hence, the rank of $A$ is less than the size of $\{e_i\} \cup \{f_j\}$, which is less than $\operatorname{rank} a + \operatorname{rank} b$. Hence, the inequality follows.

Related Question