[Math] How to calculate the Mahalanobis distance

linear algebra, probability, statistics

I am really stuck on calculating the Mahalanobis distance. I have two vectors, and I want to find the Mahalanobis distance between them. Wikipedia gives me the formula of
$$
d\left(\vec{x}, \vec{y}\right) = \sqrt{\left(\vec{x}-\vec{y}\right)^\top S^{-1} \left(\vec{x}-\vec{y}\right) }
$$

Suppose my $\vec{y}$ is $(1,9,10)$ and my $\vec{x}$ is $(17, 8, 26)$ (these are just random). Then $\vec{x}-\vec{y}$ is really easy to calculate (it would be $(16, -1, 16)$), but how in the world do I calculate $S^{-1}$? I understand that $S$ is a covariance matrix, but that is about it. I see an "X" everywhere, but that only seems like one vector to me. What do I do with the Y vector? Can someone briefly walk me through this one example, or any other example involving two vectors?

Best Answer

I think there is a misconception here: you are assuming that a Mahalanobis distance exists between just two points in the same way that a Euclidean distance does. In the case above, the Euclidean distance can simply be computed by taking $S$ to be the identity matrix, so that $S^{-1}$ is the identity as well. What Mahalanobis introduced is a distance in which the measurements (plural!) are taken in a correlated metric, so to speak. $S$ is then not assumed to be the identity matrix (which can be understood as a special correlation matrix in which all correlations are zero); instead, the metric itself is given in correlated coordinates. The correlation values in the $S$ matrix are also the cosines between oblique coordinate axes (in the Euclidean metric the axes are orthogonal, so their cosine, i.e. their correlation, is zero by definition).
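As a quick check of this special case, taking $S = I$ for the two vectors in the question reduces the formula to the ordinary Euclidean distance:
$$
d\left(\vec{x}, \vec{y}\right) = \sqrt{16^2 + (-1)^2 + 16^2} = \sqrt{513} \approx 22.65
$$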

But now, what correlations does the Mahalanobis distance assume? These are the empirical correlations between the x and the y, so we need those correlations from external knowledge or from the data itself. So, in answer to your problem, I'd say that using the Mahalanobis distance requires empirical correlations, and therefore a multitude of x and y measurements from which such correlations (such a metric) can be computed. It does not make sense to talk of a Mahalanobis distance without a basis for the actual correlations, that is, for the angles between the axes of the coordinate system, which measure the obliqueness of the metric.
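To make this concrete, here is a minimal sketch in plain Python of how the distance could be computed once a data set is available. The six sample points are invented purely for illustration (the question supplies only two vectors), and the helper names `covariance_matrix`, `solve`, and `mahalanobis` are hypothetical. Instead of forming $S^{-1}$ explicitly, the sketch solves $S\vec{z} = \vec{x}-\vec{y}$ and then takes $\sqrt{(\vec{x}-\vec{y})^\top \vec{z}}$, which gives the same value.

```python
import math


def covariance_matrix(samples):
    """Sample covariance matrix (dividing by n - 1) of equal-length rows."""
    n = len(samples)
    dim = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for row in samples:
        dev = [row[j] - means[j] for j in range(dim)]
        for a in range(dim):
            for b in range(dim):
                cov[a][b] += dev[a] * dev[b] / (n - 1)
    return cov


def solve(matrix, rhs):
    """Solve matrix * z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular; more varied data needed")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - known) / aug[i][i]
    return z


def mahalanobis(x, y, cov):
    """sqrt((x - y)^T S^{-1} (x - y)) for a given covariance matrix S."""
    diff = [a - b for a, b in zip(x, y)]
    z = solve(cov, diff)  # z = S^{-1} (x - y), without forming the inverse
    return math.sqrt(sum(d * w for d, w in zip(diff, z)))


# Made-up measurements: each row is one observation of three variables.
samples = [
    (1, 9, 10),
    (17, 8, 26),
    (5, 12, 14),
    (9, 3, 20),
    (13, 7, 18),
    (3, 5, 8),
]
x = (17, 8, 26)
y = (1, 9, 10)

S = covariance_matrix(samples)
print(mahalanobis(x, y, S))
```

The result depends entirely on the sample used to estimate $S$; a different data set gives a different distance between the same two points. In practice one would probably reach for NumPy's `numpy.cov` and SciPy's `scipy.spatial.distance.mahalanobis` (which takes the inverse covariance matrix as its third argument) rather than hand-written linear algebra.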

Edit: When all correlations are zero, $S$ is diagonal, but not necessarily the identity matrix. For $S$ to equal the identity matrix, all sampled variables must also have the same standard deviation, namely 1. The $i$-th diagonal element of $S$ is the variance $\sigma_i^2$, so it sets the metric for the $i$-th variable in units of $\sigma_i$.
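Written out for this diagonal case, with $S = \operatorname{diag}(\sigma_1^2, \dots, \sigma_n^2)$, the formula from the question becomes a Euclidean distance in which each coordinate difference is first divided by its standard deviation:
$$
d\left(\vec{x}, \vec{y}\right) = \sqrt{\sum_{i=1}^{n} \frac{\left(x_i - y_i\right)^2}{\sigma_i^2}}
$$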
