[Math] Multivariate normal distribution density function

probability, probability-distributions, probability-theory

I was just reading the Wikipedia article about the multivariate normal distribution: http://en.wikipedia.org/wiki/Multivariate_normal_distribution

I use slightly different notation. If $X_1,\ldots,X_n$ are independent $\mathcal{N}(0,1)$ random variables, $X=(X_1,\ldots,X_n)$ and $m=(m_1, \ldots, m_n)$ are $n$-dimensional vectors, and $B$ is a non-singular $n\times n$ matrix (so that $\Sigma^{-1}$ below exists), then $Y=m+BX$ has density function

$$f_Y(x)=\frac{1}{\sqrt{(2\pi)^n|\boldsymbol\Sigma|}}
\exp\left(-\frac{1}{2}({x}-{m})^T{\boldsymbol\Sigma}^{-1}({x}-{m})
\right)$$

where $\Sigma=BB^T$.
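To make the setup concrete, here is a small numerical sketch (the particular $m$ and $B$ are illustrative choices, not from the article): sampling $Y = m + BX$ with iid standard normal $X_i$ and comparing the empirical covariance of $Y$ against $BB^T$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# Illustrative choices: any n-vector m and non-singular n x n matrix B work.
m = np.array([1.0, -2.0, 0.5])
B = np.array([[2.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.5, -1.0, 3.0]])

# X has independent N(0,1) components, so Cov(X) = I.
X = rng.standard_normal((n, 1_000_000))
Y = m[:, None] + B @ X           # one sample of Y per column

Sigma = B @ B.T                  # the Sigma appearing in the density
emp_cov = np.cov(Y)              # empirical covariance of the Y samples

print(np.round(Sigma, 2))
print(np.round(emp_cov, 2))     # agrees with Sigma up to sampling noise
```

The empirical covariance matches $BB^T$ up to Monte Carlo error, which is what the question below asks to prove in general.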

Question: In the Wikipedia article it is part of the definition that $\Sigma$ is the covariance matrix of $Y$, i.e. $\Sigma_{ij}=\operatorname{Cov}(Y_i,Y_j)$. But why? How can this be proved?

Best Answer

By definition, a covariance matrix $\Sigma$ has entries $\Sigma_{i,j} = \operatorname{Cov}(X_i, X_j) = E[(X_i - \mu_i)(X_j - \mu_j)]$, where the random variables $X_i$ and $X_j$ have expectations $E[X_i] = \mu_i$ and $E[X_j] = \mu_j$ respectively. So your problem is to prove $\operatorname{Cov}(m_i + \mathbf{b}_i \mathbf{X}, m_j + \mathbf{b}_j \mathbf{X}) = (B \cdot \Sigma \cdot B^\top)_{i,j}$, where $B$ is a non-singular matrix with rows $\mathbf{b}_i = (b_{i,1}, \ldots, b_{i,n})$, $1 \le i \le n$, $\mathbf{X} = (X_1, \ldots, X_n)$ is an $n$-dimensional random vector, and $(m_1, \ldots, m_n)$ is a vector of scalars. (Note that $\Sigma$ here denotes the covariance matrix of $\mathbf{X}$, not of $Y$; in your setting $\operatorname{Cov}(\mathbf{X}) = I$, so $B \Sigma B^\top = B B^\top$.)

Let $\mathbf{X}$ have the symmetric $n \times n$ covariance matrix $\Sigma$.

Note that $E[m_i + \mathbf{b}_i \mathbf{X}] = m_i + \mathbf{b}_i \boldsymbol{\mu}$, where $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_n)$, and that the expectation operator $E$ is linear. So the proof is:

$$\begin{align}
\operatorname{Cov}(m_i + \mathbf{b}_i \mathbf{X},\, m_j + \mathbf{b}_j \mathbf{X})
&= E[(\mathbf{b}_i (\mathbf{X} - \boldsymbol{\mu}))(\mathbf{b}_j (\mathbf{X} - \boldsymbol{\mu}))] \\
&= E\left[\sum_{t=1}^n \sum_{p=1}^n b_{i,t} b_{j,p} (X_t - \mu_t)(X_p - \mu_p)\right] \\
&= \sum_{t=1}^n \sum_{p=1}^n b_{i,t} b_{j,p} E[(X_t - \mu_t)(X_p - \mu_p)] \\
&= \sum_{t=1}^n b_{i,t} \left(\sum_{p=1}^n b_{j,p} \Sigma_{p,t}\right) \\
&= \sum_{t=1}^n b_{i,t} \left( B \cdot \Sigma \right)_{j,t} \\
&= \left(B \cdot \Sigma \cdot B^\top\right)_{i,j}
\end{align}$$

So the covariance matrix of $\mathbf{m} + B \cdot \mathbf{X}$ is $B \cdot \Sigma \cdot B^\top$, where $\operatorname{Cov}(\mathbf{X}) = \Sigma$. In your question the $X_i$ are independent standard normals, so $\operatorname{Cov}(\mathbf{X}) = I$ and the covariance matrix of $Y$ is $B I B^\top = BB^\top$, which is exactly the $\Sigma$ appearing in the density.
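The identity $\operatorname{Cov}(\mathbf{m} + B\mathbf{X}) = B \Sigma B^\top$ holds for any $\Sigma = \operatorname{Cov}(\mathbf{X})$, not just $\Sigma = I$. A quick numerical sketch (with arbitrary illustrative matrices $A$ and $B$; the correlated $\mathbf{X}$ is manufactured as $A$ times iid noise, so $\operatorname{Cov}(\mathbf{X}) = AA^\top$):

```python
import numpy as np

rng = np.random.default_rng(1)

# Build a correlated X as X = A Z with Z iid N(0,1), so Cov(X) = A A^T.
A = np.array([[1.0, 0.0],
              [0.8, 0.6]])
Sigma = A @ A.T                  # covariance of X, not the identity
B = np.array([[2.0, -1.0],
              [0.5,  3.0]])      # arbitrary coefficient matrix

X = A @ rng.standard_normal((2, 1_000_000))
Y = np.array([1.0, -1.0])[:, None] + B @ X   # the shift m drops out of Cov

print(np.round(B @ Sigma @ B.T, 2))   # predicted covariance of Y
print(np.round(np.cov(Y), 2))         # empirical covariance of Y
```

Up to sampling noise, the empirical covariance of $Y$ matches $B \Sigma B^\top$, confirming the general statement proved above.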
