Statistics – Proof That the Mahalanobis Distance is Non-Negative

mahalanobis-distance, metric-spaces, proof-writing, statistics

I was just introduced to the Mahalanobis distance between two vectors $\mathrm{\mathbf{X}}$ and $\mathrm{\mathbf{Y}}$ of random variables:

$$\| \mathrm{\mathbf{X}} - \mathrm{\mathbf{Y}}\|_{\Sigma} = \left((\mathrm{\mathbf{X}} - \mathrm{\mathbf{Y}})^T \Sigma^{-1}(\mathrm{\mathbf{X}} - \mathrm{\mathbf{Y}})\right)^{1/2},$$

where $\Sigma$ is the covariance matrix.

As I understand it, the 4 properties that a function $d(x,y)$ must satisfy in order to be a metric are as follows:

  1. $d(x, y) \ge 0$
  2. $d(x, y) = 0 \Longleftrightarrow x = y$
  3. $d(x, y) = d(y, x)$
  4. $d(x, z) \le d(x, y) + d(y, z)$

I only have an introductory-level knowledge of statistics, so I'm wondering how the Mahalanobis distance satisfies property 1. Ignoring the square root, why can't $(\mathrm{\mathbf{X}} - \mathrm{\mathbf{Y}})^T \Sigma^{-1}(\mathrm{\mathbf{X}} - \mathrm{\mathbf{Y}})$ be negative?

I would greatly appreciate it if people could please take the time to clarify this.

Best Answer

This is because $\Sigma^{-1}$ (the inverse of the covariance matrix) is symmetric positive definite: a covariance matrix $\Sigma$ is always symmetric and positive semi-definite, and when it is invertible it is positive definite, so its inverse is symmetric positive definite as well.
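
One way to make the non-negativity explicit is through the Cholesky factorization, which every SPD matrix admits: write $\Sigma^{-1} = LL^T$ with $L$ invertible, and set $v = \mathrm{\mathbf{X}} - \mathrm{\mathbf{Y}}$. Then

$$v^T \Sigma^{-1} v = v^T L L^T v = (L^T v)^T (L^T v) = \|L^T v\|^2 \ge 0,$$

with equality only when $v = 0$ (since $L$ is invertible), which also gives property 2.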

Once you have a symmetric positive definite (SPD) matrix $S$, it is easy to define:

  • a scalar product $$\langle v,w\rangle_S=\langle v,Sw\rangle$$ (where $\langle.,.\rangle$ is the usual scalar product)
  • an associated norm $\|v\|_S = \langle v,v\rangle_S^{1/2}$ and distance: $$ d(v,w)=\|v-w\|_S=\langle v-w,v-w\rangle_S^{1/2}=\langle v-w,S(v-w)\rangle^{1/2}, $$ which is exactly the Mahalanobis distance when $S = \Sigma^{-1}$.

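As a quick numerical illustration, here is a minimal NumPy sketch (the covariance matrix is estimated from made-up data, and the `mahalanobis` helper is just for this example) showing that the quadratic form never comes out negative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 500 samples of a 3-dimensional random vector.
data = rng.normal(size=(500, 3))
Sigma = np.cov(data, rowvar=False)   # sample covariance matrix (SPD here)
Sigma_inv = np.linalg.inv(Sigma)

def mahalanobis(x, y, S_inv):
    """((x - y)^T S_inv (x - y))^(1/2), the Mahalanobis distance."""
    d = x - y
    return np.sqrt(d @ S_inv @ d)

# The quadratic form is non-negative for any pair of vectors,
# because Sigma_inv is symmetric positive definite.
x, y = rng.normal(size=3), rng.normal(size=3)
quad = (x - y) @ Sigma_inv @ (x - y)
print(quad >= 0)                     # True
print(mahalanobis(x, y, Sigma_inv))  # a non-negative real number
```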