How to Calculate Probability for Multivariate Normal Distribution Data Point

Tags: chi-squared-distribution, cumulative-distribution-function, multivariate-normal-distribution, probability, python

Okay, so I have an $n$-dimensional normal distribution with the mean vector and covariance matrix defined (for now we can assume that these are the true distribution parameters, not estimates). For a given data point I want to calculate the probability that this point belongs to this distribution.

I believe I would be interested in the probability of generating a point "at least as unlikely" as the given data point. In the $1D$ normal case this would be the area under the "two tails" of the PDF, e.g. $1-(\mathrm{CDF}(x)-\mathrm{CDF}(2\mu-x))$ for $x \ge \mu$. Now I want to calculate this probability for a general $n$-dimensional normal distribution. I believe maybe this should be $1$ minus the probability mass inside the hyper-sphere, in Mahalanobis-distance space, whose radius is the Mahalanobis distance of the data point from the mean.
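For the $1D$ case, here is a minimal sketch of that two-tail computation (the values of mu, sd and x below are made up for illustration, and the standard deviation sd is an extra assumed parameter):

from scipy import stats

mu, sd = 0.0, 1.0   # assumed true mean and standard deviation
x = 1.5             # observed data point (here x >= mu)

# area under both tails: beyond x and beyond its mirror image 2*mu - x
p_two_tail = 1 - (stats.norm.cdf(x, loc=mu, scale=sd) - stats.norm.cdf(2*mu - x, loc=mu, scale=sd))
# equivalent form that also works when x < mu
p_two_tail_sym = 2 * stats.norm.sf(abs(x - mu), scale=sd)
print(p_two_tail, p_two_tail_sym)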

Question 1: Is this interpretation correct?

Question 2: How do I calculate this?

Question 3: Does the analysis change if the mean and covariance matrix are only estimates of the true parameters?

Question 4: Is there an easy way to do this in Python?

My best attempt at solving this myself is below (my statistics knowledge isn't very good, so I am asking for confirmation of correctness; it is very important that I get this right).

According to Wikipedia:

$$(x-\mu)^T\Sigma^{-1}(x-\mu) \le \chi_k^2(p)$$

So I can first calculate the squared Mahalanobis distance as above (call it MD), and then maybe I just have to evaluate the CDF of the chi-squared distribution with $k$ degrees of freedom at MD, and take $1$ minus this.

I have no idea if this is correct, but currently my best guess.

Thanks in advance.

Best Answer

Yeah, that sounds right. If you have parameters $\mu$ and $\Sigma$ and data point $x$, then the set of all data points that are less likely than $x$ is the set of points with lower density, or in other words a larger Mahalanobis distance: \begin{align*} (2\pi)^{-k/2}|\Sigma|^{-1/2}e^{\left[-\frac{1}{2}(y-\mu)^T\Sigma^{-1}(y-\mu) \right]} &\le (2\pi)^{-k/2}|\Sigma|^{-1/2}e^{\left[-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu) \right]} \iff\\ e^{\left[-\frac{1}{2}(y-\mu)^T\Sigma^{-1}(y-\mu) \right]} &\le e^{\left[-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu) \right]} \iff \\ (y-\mu)^T\Sigma^{-1}(y-\mu) &\ge (x-\mu)^T\Sigma^{-1}(x-\mu). \end{align*}
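As a quick numerical illustration of that equivalence (my own example, not part of the original answer): a point with a larger Mahalanobis distance from $\mu$ should have a smaller density.

import numpy as np
from scipy import stats

mu = np.array([0.0, 0.0])
sigma = np.array([[2.0, 0.3], [0.3, 1.0]])
rv = stats.multivariate_normal(mean=mu, cov=sigma)

def sq_mahalanobis(p):
    d = p - mu
    return d @ np.linalg.inv(sigma) @ d

x = np.array([1.0, 0.5])
y = np.array([2.0, -1.0])
print(sq_mahalanobis(x), rv.pdf(x))   # smaller squared distance, larger density
print(sq_mahalanobis(y), rv.pdf(y))   # larger squared distance, smaller density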

So you want the probability of some unseen $Y$ being at least as unlikely as your observed $x$, which is the same as $Q = (Y-\mu)^T\Sigma^{-1}(Y-\mu)$ being at least as large as $(x-\mu)^T\Sigma^{-1}(x-\mu)$. And you are correct in pointing out that $Q \sim \chi^2(k)$. So $$ \mathbb{P}\left[(Y-\mu)^T\Sigma^{-1}(Y-\mu) \ge (x-\mu)^T\Sigma^{-1}(x-\mu)\right] = 1- \mathbb{P}\left[Q \le (x-\mu)^T\Sigma^{-1}(x-\mu)\right]. $$
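A Monte Carlo sanity check of that statement (my own sketch, not from the answer): draw samples $Y$ from the multivariate normal, compute their squared Mahalanobis distances, and compare the empirical tail probability with the chi-squared one.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
k = 3
mu = np.zeros(k)
sigma = np.eye(k)
x = np.ones(k)

# squared Mahalanobis distance of the observed point
sq_md_x = (x - mu) @ np.linalg.inv(sigma) @ (x - mu)

# squared Mahalanobis distances of simulated points
samples = rng.multivariate_normal(mu, sigma, size=200_000)
diffs = samples - mu
sq_md_samples = np.einsum('ij,jk,ik->i', diffs, np.linalg.inv(sigma), diffs)

print(np.mean(sq_md_samples >= sq_md_x))   # empirical P[Q >= sq_md_x]
print(1 - stats.chi2.cdf(sq_md_x, df=k))   # theoretical value, about 0.392 here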

In Python you can calculate that with something like this:

from scipy import stats
import numpy as np

x = np.array([1, 1, 1])
mu = np.array([0, 0, 0])
sigma = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])

# squared Mahalanobis distance: (x - mu)^T Sigma^{-1} (x - mu)
m_dist_x = np.dot((x - mu).transpose(), np.linalg.inv(sigma))
m_dist_x = np.dot(m_dist_x, (x - mu))

# tail probability: P[Q >= m_dist_x] with Q ~ chi-squared(3)
print(1 - stats.chi2.cdf(m_dist_x, 3))
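If it's helpful, the same calculation can be wrapped in a small function (my own sketch; stats.chi2.sf is just 1 minus the CDF) that also works for a non-identity covariance matrix:

import numpy as np
from scipy import stats

def tail_probability(x, mu, sigma):
    # P[Q >= squared Mahalanobis distance of x], with Q ~ chi-squared(k)
    diff = np.asarray(x) - np.asarray(mu)
    sq_md = diff @ np.linalg.inv(sigma) @ diff
    return stats.chi2.sf(sq_md, df=len(diff))

print(tail_probability([1, 1, 1], [0, 0, 0], np.eye(3)))   # about 0.392, matches the snippet above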