[Math] Generating marginal PDF of multivariate Gaussian

probability-distributions, probability-theory

I am wondering about the proof of the statement on Wikipedia's page on marginals of multivariate Gaussian density functions: to obtain the marginal distribution over a subset of multivariate normal random variables, one only needs to drop the irrelevant variables from the mean vector and the covariance matrix. I can plainly visualize this for independent variables, since the covariance entries involving the dropped variables are either zero or the variances of the "irrelevant" variables themselves. For variables that are not independent, however, I am confused as to why the cross terms simply go away in the covariance matrix.

Is this statement only valid for independent variables, or is there an elegant proof that yet eludes me?

Thank you!

Best Answer

It is easiest to look at this from the characteristic function point of view.

$\phi(t_1, t_2, \ldots, t_n) = \mathbb{E}( \exp(i \langle \vec{t}, \vec{x} \rangle)) = \exp( i \langle \vec{t}, \vec{\mu}\rangle - \frac{1}{2} \langle \vec{t}, \Sigma \vec{t}\rangle)$.

The characteristic function of the marginal distribution is the characteristic function of the original multivariate normal with the $t$'s of the marginalized variables set to zero.
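
To see why, note that setting the dropped components of $\vec{t}$ to zero removes those variables from the inner product $\langle \vec{t}, \vec{x} \rangle$. A minimal sketch, writing $A$ for the kept indices and $B$ for the dropped ones (this block notation is mine, not part of the original statement):

$$
\phi_A(\vec{t}_A)
= \mathbb{E}\big( \exp( i \langle \vec{t}_A, \vec{x}_A \rangle ) \big)
= \mathbb{E}\big( \exp( i \langle (\vec{t}_A, \vec{0}_B), \vec{x} \rangle ) \big)
= \phi(\vec{t}_A, \vec{0}_B).
$$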

Clearly, setting $t_i = 0$ is equivalent to dropping out the $i$-th variable.
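
Spelling this out with the same (introduced) block notation: with $\vec{t} = (\vec{t}_A, \vec{0}_B)$, every term of the quadratic form that involves a dropped index is multiplied by zero, so

$$
\langle \vec{t}, \vec{\mu} \rangle = \langle \vec{t}_A, \vec{\mu}_A \rangle,
\qquad
\langle \vec{t}, \Sigma \vec{t} \rangle = \sum_{j,k} t_j \Sigma_{jk} t_k = \langle \vec{t}_A, \Sigma_{AA} \vec{t}_A \rangle,
$$

and the marginal characteristic function becomes $\exp( i \langle \vec{t}_A, \vec{\mu}_A \rangle - \frac{1}{2} \langle \vec{t}_A, \Sigma_{AA} \vec{t}_A \rangle)$, which is exactly the characteristic function of a multivariate normal with mean $\vec{\mu}_A$ and covariance $\Sigma_{AA}$, the submatrix of $\Sigma$ for the kept variables. The cross terms never need to "go away"; they are simply never picked up once the corresponding $t$'s are zero, independence or not.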