Solved – Relation between Two Sample Hotelling’s T-test and Mahalanobis Distance

hotelling-t2, linear algebra, multivariate analysis

Mahalanobis distance is a measure of the distance between a point and a distribution. So if we want to check whether a point belongs to a particular distribution, we can use the one-sample Hotelling's T-test, whose statistic is a squared Mahalanobis distance. But if we have two samples and want to check whether they come from the same group, we can use the two-sample Hotelling's T-test, that is,

$T^2 = n\,(X-Y)^T S^{-1} (X-Y)$

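(assuming both samples have the same size $n$), where $X$ and $Y$ are the two sample mean vectors and $S = S_x + S_y$ is the sum of the two sample covariance matrices (twice the pooled covariance when the sample sizes are equal). My question is: is there a relation between the two-sample Hotelling's T-test and the Mahalanobis distance, similar to the one between the one-sample Hotelling's T-test and the Mahalanobis distance? How are they mathematically related?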

My thinking is that there is some relation (an equality) between them, but I can't get my head around it. Any guidance would be greatly appreciated. Thanks

Best Answer

Assuming $\mathbf{X} \sim \operatorname{MVN}(\boldsymbol{\mu}, \Sigma)$, i.e. it follows a multivariate normal distribution with known mean $\boldsymbol{\mu}$ and covariance $\Sigma$, the quantity $(\mathbf{X} - \boldsymbol{\mu})^T \Sigma^{-1}(\mathbf{X} - \boldsymbol{\mu})$ (the squared Mahalanobis distance) follows a chi-squared distribution with $p$ degrees of freedom, where $p$ is the number of dimensions of $\mathbf{X}$.

However, if we have to estimate $\Sigma$ from $n$ samples, we denote the estimate by $\mathbf{S}$, a scaled version of which follows a Wishart distribution with $n$ degrees of freedom ($n-1$ if the mean has to be estimated as well). Then, for $\mathbf{X}$ independent of $\mathbf{S}$, $(\mathbf{X} - \boldsymbol{\mu})^T \mathbf{S}^{-1}(\mathbf{X} - \boldsymbol{\mu})$ follows the Hotelling $T^2$ distribution with parameters $p$ and $n$.

So the two statistics share the same quadratic form but are used in different contexts (known versus estimated covariance), and their distributions agree asymptotically as $n$ grows.
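To tie this back to the two-sample statistic in the question, here is a minimal sketch in plain Python (standard library only, restricted to $p = 2$ so the matrix inverse can be written by hand). It assumes the usual pooled covariance $S_p = \frac{(n_1-1)S_x + (n_2-1)S_y}{n_1+n_2-2}$ and the usual two-sample statistic $T^2 = \frac{n_1 n_2}{n_1+n_2} D^2$, where $D^2$ is the squared Mahalanobis distance between the two sample means under $S_p$. The data and all function names are made up for illustration.

```python
from statistics import fmean


def mean_vec(rows):
    """Column-wise mean of a list of 2-D points."""
    return [fmean(col) for col in zip(*rows)]


def cov2(rows):
    """Unbiased 2x2 sample covariance matrix (divisor n - 1)."""
    n = len(rows)
    m = mean_vec(rows)
    c = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                c[i][j] += d[i] * d[j] / (n - 1)
    return c


def quad_form_inv(d, s):
    """Return d^T s^{-1} d for a 2-vector d and an invertible 2x2 matrix s."""
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    return sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))


# Hypothetical data: two samples of equal size n = 4 in p = 2 dimensions.
x = [(2.0, 1.0), (3.0, 2.5), (4.0, 2.0), (5.0, 4.5)]
y = [(1.0, 3.0), (2.0, 3.5), (2.5, 5.0), (4.5, 6.5)]

n1, n2 = len(x), len(y)
mx, my = mean_vec(x), mean_vec(y)
diff = [mx[0] - my[0], mx[1] - my[1]]
sx, sy = cov2(x), cov2(y)

# Pooled covariance and squared Mahalanobis distance between the means.
pooled = [[((n1 - 1) * sx[i][j] + (n2 - 1) * sy[i][j]) / (n1 + n2 - 2)
           for j in range(2)] for i in range(2)]
d2 = quad_form_inv(diff, pooled)

# Two-sample Hotelling statistic: a rescaled squared Mahalanobis distance.
t2 = n1 * n2 / (n1 + n2) * d2

# The question's form for equal sample sizes: n * diff^T (Sx + Sy)^{-1} diff.
s_sum = [[sx[i][j] + sy[i][j] for j in range(2)] for i in range(2)]
t2_question = n1 * quad_form_inv(diff, s_sum)

print("D^2 =", d2)
print("T^2 =", t2)
print("T^2 (question's form) =", t2_question)
```

With equal sample sizes the pooled covariance is $(S_x + S_y)/2$, so the two expressions for $T^2$ coincide and $T^2 = \frac{n}{2} D^2$: the two-sample statistic is the squared Mahalanobis distance between the sample means, scaled by a factor that depends only on the sample sizes.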

Note that this is directly analogous to the univariate case, where $X \sim N(\mu, \sigma^2)$. With known $\sigma^2$, the mean $\bar{X}$ of $n$ samples satisfies $z = \frac{\bar{X}-\mu}{\sigma / \sqrt{n}} \sim N(0, 1)$, i.e. the standard normal distribution. However, if we have to estimate $\sigma^2$ from data, the scaled estimator follows a chi-squared distribution, i.e. $\frac{(n-1)s^2}{\sigma^2} \sim \chi^2(n-1)$. Then $t = \frac{\bar{X}-\mu}{s / \sqrt{n}} \sim t(n-1)$.
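As a quick check of the analogy, the one-sample statistic can be written for the sample mean $\bar{\mathbf{X}}$ of $n$ observations (the $\bar{\mathbf{X}}$ notation and the factor $n$ are introduced here for illustration; they are not part of the formulas above). For $p = 1$, where $\mathbf{S}$ reduces to $s^2$, it collapses to the square of the Student $t$ statistic:

$$
T^2 = n\,(\bar{\mathbf{X}} - \boldsymbol{\mu})^T \mathbf{S}^{-1} (\bar{\mathbf{X}} - \boldsymbol{\mu})
\quad\overset{p=1}{=}\quad
\frac{n\,(\bar{X} - \mu)^2}{s^2}
= \left(\frac{\bar{X} - \mu}{s/\sqrt{n}}\right)^2
= t^2
$$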

So, both the Student $t$ and the Hotelling $T^2$ represent "more uncertain" versions (i.e. sampling distributions) of their respective "certain" versions. However, both asymptotically approach their "certain" versions, namely $N(0, 1)$ and $\chi^2(p)$ respectively, as $n$, the number of samples, approaches infinity.
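The limit for the Hotelling case can be made explicit through the standard link to the $F$ distribution. Here $m$ is a symbol introduced for this sketch, denoting the degrees of freedom of the covariance estimate:

$$
\frac{m - p + 1}{p\,m}\, T^2 \sim F_{p,\; m-p+1},
\qquad
p\,F_{p,\; m-p+1} \xrightarrow{d} \chi^2(p)
\quad \text{as } m \to \infty
$$

Since $\frac{m-p+1}{m} \to 1$, it follows that $T^2$ itself converges in distribution to $\chi^2(p)$, mirroring the way $t(n-1)$ converges to $N(0, 1)$.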