[Math] How can one prove that the Mahalanobis distance is a metric?

data mining, metric-spaces

How can one prove that the Mahalanobis distance is a metric? In particular, how can one show that the following four properties of a metric hold for the Mahalanobis distance?

1) d(x, y) ≥ 0 (non-negativity, or separation axiom)

2) d(x, y) = 0 if and only if x = y (identity of indiscernibles, or coincidence axiom)

3) d(x, y) = d(y, x) (symmetry)

4) d(x, z) ≤ d(x, y) + d(y, z) (subadditivity / triangle inequality).

I thought that the Mahalanobis distance is just a rescaling of each point according to the standard deviations of its dimensions. Let me elaborate:

Say you have thousands of vectors, each with coordinates x and y, where X and Y are two random variables. They have means and variances as well as a covariance. You fit a line to show the correlation between X and Y, then draw another line perpendicular to it. You call the intersection the new origin, rescale along the regression line using the covariance, and rescale along the perpendicular line using the variances of X and Y. Since after that rescaling the distance between any two points is the Euclidean distance, and since the Euclidean distance is a metric, the Mahalanobis distance is also a metric (see the sketch below).
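Here is a minimal numerical sketch of that intuition, assuming NumPy is available; the data, covariance values, and helper names are purely illustrative. It draws correlated 2-D data, computes the Mahalanobis distance with $S = \Sigma^{-1}$, and checks that it equals the Euclidean distance after the points are rescaled by the whitening matrix $\Sigma^{-1/2}$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative correlated 2-D data (values are arbitrary).
data = rng.multivariate_normal(mean=[0.0, 0.0],
                               cov=[[2.0, 1.2], [1.2, 1.0]],
                               size=5000)
Sigma = np.cov(data, rowvar=False)   # sample covariance matrix
S = np.linalg.inv(Sigma)             # S = Sigma^{-1}

def mahalanobis(x, y):
    """sqrt((x - y)^T S (x - y))."""
    d = x - y
    return np.sqrt(d @ S @ d)

# Whitening matrix W = Sigma^{-1/2}: after mapping points through W,
# the plain Euclidean distance reproduces the Mahalanobis distance.
eigvals, eigvecs = np.linalg.eigh(Sigma)
W = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T

x, y = data[0], data[1]
print(mahalanobis(x, y))              # Mahalanobis distance
print(np.linalg.norm(W @ x - W @ y))  # Euclidean distance after whitening
```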

I don't know how to prove those properties. I tried but I couldn't, so I came up with my own logic. So don't laugh, it makes sense to me :)

Best Answer

Let us define the Mahalanobis inner product of two vectors $x, y \in \mathbb{R}^n$ with respect to the real, symmetric, positive semi-definite matrix $S$ as $(x, y)_m = x^T S y$ (for the Mahalanobis distance, $S$ is the inverse of the covariance matrix). From the symmetry of $S$ it follows that $(x, y)_m = (y, x)_m$. Further, if we define the Mahalanobis norm as $||x||_m = \sqrt{(x, x)_m}$, then $d_m(x, y) = ||x - y||_m$, which immediately gives non-negativity and symmetry of $d_m$. For the identity of indiscernibles we additionally need $S$ to be positive definite (equivalently, the covariance matrix to be invertible): if $S$ is only semi-definite, there can be $x \ne y$ with $d_m(x, y) = 0$.
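For completeness, the symmetry of the inner product is a one-line computation: since $x^T S y$ is a scalar, it equals its own transpose, so

$$(x, y)_m = x^T S y = \left( x^T S y \right)^T = y^T S^T x = y^T S x = (y, x)_m.$$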

We then use the Cauchy-Schwarz inequality (proved below), $(x, y)_m \le ||x||_m ||y||_m$, to show that $||x + y||_m \le ||x||_m + ||y||_m$.
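Spelling out that step (the standard norm expansion, with Cauchy-Schwarz applied to the middle term):

$$||x + y||_m^2 = ||x||_m^2 + 2(x, y)_m + ||y||_m^2 \le ||x||_m^2 + 2||x||_m\,||y||_m + ||y||_m^2 = \left( ||x||_m + ||y||_m \right)^2,$$

and taking square roots gives $||x + y||_m \le ||x||_m + ||y||_m$.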

Finally, $d_m(x, z) = ||x - z||_m = ||x - y + y - z||_m \le ||x - y||_m + ||y - z||_m = d_m(x, y) + d_m(y, z)$, which is the triangle inequality.
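As a quick sanity check (not part of the proof), the three non-trivial axioms can be verified numerically on random vectors; the matrix $S$ below is a hypothetical positive definite matrix built as $A^T A$ plus a small ridge term, and NumPy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical positive definite S: A^T A plus a small ridge term.
A = rng.normal(size=(3, 3))
S = A.T @ A + 1e-6 * np.eye(3)

def d_m(x, y):
    diff = x - y
    return np.sqrt(diff @ S @ diff)

for _ in range(10_000):
    x, y, z = rng.normal(size=(3, 3))
    assert d_m(x, y) >= 0                               # non-negativity
    assert np.isclose(d_m(x, y), d_m(y, x))             # symmetry
    assert d_m(x, z) <= d_m(x, y) + d_m(y, z) + 1e-12   # triangle inequality

print("all checks passed")
```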

To prove the Cauchy-Schwarz inequality, start with $||x - \lambda y||_m^2 \ge 0$ for any real $\lambda$; this holds because $S$ is positive semi-definite. If $||y||_m \ne 0$, choose $\lambda = (x, y)_m/||y||_m^2$ and expand the square. (If $||y||_m = 0$, the same expansion, which holds for every $\lambda$, forces $(x, y)_m = 0$, so the inequality is immediate.)
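Carrying out that substitution in the case $||y||_m \ne 0$:

$$0 \le ||x - \lambda y||_m^2 = ||x||_m^2 - 2\lambda (x, y)_m + \lambda^2 ||y||_m^2 = ||x||_m^2 - \frac{(x, y)_m^2}{||y||_m^2},$$

so $(x, y)_m^2 \le ||x||_m^2\, ||y||_m^2$, and taking square roots yields the Cauchy-Schwarz inequality.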
