Definition of degenerate multivariate normal distribution

Tags: covariance, normal-distribution, probability, probability-distributions, statistics

I am reading some notes and having some trouble with the definition of multivariate normal distribution when the covariance matrix is not invertible. I will state my understanding below, and hopefully someone can chime in with some words of wisdom.

Suppose $\Sigma$ is an $n\times n$ covariance matrix (symmetric and positive semi-definite). When $\Sigma$ is invertible, we say that a random vector $X$ has a multivariate normal distribution with mean $0$ and covariance $\Sigma$ if it has density given by
$$f_X(x)=\frac{1}{(2\pi)^{n/2}(\det\Sigma)^{1/2}}\exp\left(-\frac{1}{2}x^T\Sigma^{-1}x\right).$$
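For reference, here is a minimal numpy/scipy sketch (variable names are mine) that evaluates this density in the invertible case and cross-checks it against `scipy.stats.multivariate_normal`:

```python
import numpy as np
from scipy.stats import multivariate_normal

def mvn_density(x, Sigma):
    """Density of N(0, Sigma) at x, assuming Sigma is invertible."""
    n = len(x)
    quad = x @ np.linalg.solve(Sigma, x)                    # x^T Sigma^{-1} x
    norm_const = (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * quad) / norm_const

Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([0.3, -0.7])

print(mvn_density(x, Sigma))                                    # manual formula
print(multivariate_normal(mean=np.zeros(2), cov=Sigma).pdf(x))  # scipy, should agree
```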

Now, when $\Sigma$ is not invertible, the above density is clearly not defined. The notes mention that the Cramér-Wold device can be used to define $N(0,\Sigma)$ in this case, but move on without explicitly doing so.

Could somebody please give a simple explanation/definition of $N(0,\Sigma)$ when $\Sigma$ is not invertible?

Remark: For the univariate normal distribution, I understand that $N(0,0)$ corresponds to the degenerate distribution $\delta_0$, i.e. the distribution that is $0$ with probability $1$. I cannot see how this would work in higher dimensions, though!

Best Answer

An important property of the multivariate normal distribution is that if $X$ has an $n$-dimensional normal distribution, then $BX+c$ has an $m$-dimensional normal distribution for any $m\times n$ matrix $B$ and $m$-dimensional column vector $c$.

It can be shown that $$\mathbb{E}[BX + c] = B\mathbb{E}[X]+c \quad \text{ and } \quad \text{Var}(BX+c)=B\text{Var}(X)B^T$$
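As a quick numerical sanity check (my own sketch, not part of the original answer), these two identities can be verified by simulation for $X\sim N(0,I)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, N = 3, 2, 200_000

B = rng.normal(size=(m, n))
c = rng.normal(size=m)

X = rng.normal(size=(N, n))          # rows are i.i.d. N(0, I_n) vectors
Y = X @ B.T + c                      # each row is B x + c

print(np.max(np.abs(Y.mean(axis=0) - c)))                     # ~0: E[BX + c] = B*0 + c = c
print(np.max(np.abs(np.cov(Y, rowvar=False) - B @ B.T)))      # ~0: Var(BX + c) = B I B^T
```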

Using this we can characterize the multivariate normal distribution as an affine transformation of independent $N(0,1)$ variables.

The construction goes as follows. Suppose we want to construct a normal distribution with mean vector $\mu$ and covariance matrix $\Sigma$. Let $X_1,\dots,X_n$ be independent $N(0,1)$ variables; then $X=(X_1,\dots,X_n)^T$ has an $n$-dimensional normal distribution with mean $0$ and covariance matrix $I$ (the identity matrix). Consider the transformation $$Y=\Sigma^{1/2}X + \mu,$$ where $\Sigma^{1/2}$ is the symmetric square root of $\Sigma$ (see https://en.wikipedia.org/wiki/Square_root_of_a_matrix#By_diagonalization). Note that $\Sigma^{1/2}$ exists for any symmetric positive semi-definite $\Sigma$, whether or not $\Sigma$ is invertible.

Then $Y$ has an $n$-dimensional normal distribution with mean $\mu$ and covariance matrix $$\text{Var}(Y) = \Sigma^{1/2} I (\Sigma^{1/2})^T=\Sigma.$$
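Not part of the original answer, but here is a minimal numpy sketch of this construction (the helper names `symmetric_sqrt` and `sample_mvn` are my own). The point to notice is that it only needs a square root of $\Sigma$, never $\Sigma^{-1}$, so it works unchanged when $\Sigma$ is singular:

```python
import numpy as np

def symmetric_sqrt(Sigma):
    """Symmetric square root of a positive semi-definite matrix via diagonalization."""
    eigval, eigvec = np.linalg.eigh(Sigma)
    eigval = np.clip(eigval, 0.0, None)        # guard against tiny negative round-off
    return eigvec @ np.diag(np.sqrt(eigval)) @ eigvec.T

def sample_mvn(mu, Sigma, size, rng):
    """Draw `size` samples of Y = Sigma^{1/2} X + mu with X ~ N(0, I)."""
    root = symmetric_sqrt(Sigma)               # never inverts Sigma
    X = rng.normal(size=(size, len(mu)))
    return X @ root.T + mu

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 2))
Sigma = A @ A.T                                 # 3x3 covariance of rank 2 (singular)
Y = sample_mvn(np.zeros(3), Sigma, 200_000, rng)

print(np.round(np.cov(Y, rowvar=False), 2))     # approximately Sigma
print(np.round(Sigma, 2))
```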

An interesting consequence is that if $\Sigma$ has rank $k$, then $Y$ is concentrated on a $k$-dimensional affine subspace of $\mathbb{R}^n$. If $k<n$, this subspace has Lebesgue measure $0$ in $\mathbb{R}^n$, which means that a density with respect to Lebesgue measure cannot exist.
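To make the degenerate case concrete, here is a small sketch (my own example, not from the answer) with a rank-$1$ covariance in $\mathbb{R}^2$: every sample lands exactly on the line $y=x$, so the distribution lives on a set of Lebesgue measure $0$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Rank-1 covariance in R^2: Sigma = v v^T with v = (1, 1)^T.
v = np.array([1.0, 1.0])
Sigma = np.outer(v, v)

# Sample Y = Z * v with Z ~ N(0, 1); then Var(Y) = v v^T = Sigma,
# and every sample lies on the line {(t, t)}.
Z = rng.normal(size=100_000)
Y = np.outer(Z, v)

print(np.allclose(Y[:, 0], Y[:, 1]))            # True: Y is concentrated on the line y = x
print(np.round(np.cov(Y, rowvar=False), 2))     # approximately Sigma = [[1, 1], [1, 1]]
```

(For what it's worth, if I remember the API correctly, `scipy.stats.multivariate_normal` accepts `allow_singular=True` and handles exactly this situation by working with a pseudo-inverse of $\Sigma$ on the supporting subspace.)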
