Probability – Conditional Distributions of the Multivariate Normal

normal distributionprobability

Wikipedia gives details on the conditional distribution of the multivariate normal:

If $\mu$ and $\Sigma$ are partitioned as follows

$\boldsymbol\mu = \begin{bmatrix} \boldsymbol\mu_1 \\ \boldsymbol\mu_2 \end{bmatrix}$

$\boldsymbol\Sigma = \begin{bmatrix} \boldsymbol\Sigma_{11} & \boldsymbol\Sigma_{12} \\ \boldsymbol\Sigma_{21} & \boldsymbol\Sigma_{22} \end{bmatrix} \quad$

then, the distribution of $x_1$ conditional on $x_2= a$ is
multivariate normal $(x_1|x_2=a) \sim N(\bar{\mu}, \bar{\Sigma})$
where

$\bar{\boldsymbol\mu} = \boldsymbol\mu_1 + \boldsymbol\Sigma_{12} \boldsymbol\Sigma_{22}^{-1} \left( \mathbf{a} – \boldsymbol\mu_2 \right) $

and covariance matrix

$\overline{\boldsymbol\Sigma} = \boldsymbol\Sigma_{11} – \boldsymbol\Sigma_{12} \boldsymbol\Sigma_{22}^{-1} \boldsymbol\Sigma_{21}. $

How can I prove this result? Wikipedia cites Eaton, Morris L. (1983). Multivariate Statistics: a Vector Space Approach. John Wiley and Sons. pp. 116–117., but I don't have this book handy…

Best Answer

Consider the following transformation $$\begin{align} Y_1 &=X_1-\Sigma_{12}\Sigma_{22}^{-1}X_2 \\ Y_2 &=X_2\end{align}$$ Then $$\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} \sim\mathcal{N_p}\left[\begin{pmatrix} \mu_1-\Sigma_{12}\Sigma_{22}^{-1}\mu_2 \\ \mu_2 \end{pmatrix} ,\begin{pmatrix} \bar \Sigma & 0 \\ 0 & \Sigma_{22} \end{pmatrix} \right] $$

Hence the PDF of $\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} $ is $$\begin{align}n(y_1,y_2) &= n(y_1|\mu_1-\Sigma_{12}\Sigma_{22}^{-1}\mu_2, \bar \Sigma)\cdot n(y_2| \mu_2,\Sigma_{22}) \end{align}$$ Since $Y_1 \sim \mathcal{N_{p1}}(\mu_1-\Sigma_{12}\Sigma_{22}^{-1}\mu_2, \bar \Sigma)$ and $Y_2 \sim \mathcal{N_{p-p1}}(\mu_2,\Sigma_{22})$ independently, as $Cov(Y_1,Y_2)=0$.

The PDF of $\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} $ will be obtained by replacing $y_1$ by $x_1-\Sigma_{12}\Sigma_{22}^{-1}x_2$ and $y_2$ by $x_2$. Note that here jacobian of the transformation is unity.

Hence the PDF of $\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} $ is $$\begin{align}n(x_1,x_2) &= n((x_1-\Sigma_{12}\Sigma_{22}^{-1}x_2)|\mu_1-\Sigma_{12}\Sigma_{22}^{-1}\mu_2, \bar \Sigma)\cdot n(x_2| \mu_2,\Sigma_{22}) \end{align}$$

Hence the conditional PDF of $X_1$ given $X_2=a$ is

$$\begin{align}f_{X_1|X_2=a}(x_1) &= \dfrac{n(x_1,a)}{n(a|\mu_2,\Sigma_{22})} \\ &=n(x_1-\Sigma_{12}\Sigma_{22}^{-1}a|\mu_2-\Sigma_{12}\Sigma_{22}^{-1}\mu_2, \bar \Sigma) \\ &= \dfrac{1}{(2 \pi)^{p/2}\sqrt{|\bar \Sigma|}}\exp\left( \frac{1}{2}\left[ (x_1-\mu_1- \Sigma_{12}\Sigma_{22}^{-1}(a-\mu_2))'\bar \Sigma^{-1}(x_1-\mu_1- \Sigma_{12}\Sigma_{22}^{-1}(a-\mu_2)) \right]\right) \end{align}$$

Hence $X_1|X_2=a \sim \mathcal{N_{p1}}(\bar \mu, \bar \Sigma)$.

Related Question