Solved – Multivariate normal posterior

Tags: bayesian, linear algebra, matrix, normal distribution, posterior

This is a very simple question, but I can't find the derivation anywhere on the internet or in a book. I would like to see the derivation of how one performs a Bayesian update of a multivariate normal distribution. For example: imagine that

$$ \begin{array}{rcl}
\mathbb{P}({\bf x}|{\bf \mu},{\bf \Sigma}) & = & N({\bf \mu}, {\bf \Sigma}) \\
\mathbb{P}({\bf \mu}) &= & N({\bf \mu_0}, {\bf \Sigma_0})\,. \end{array}
$$

After observing a set of ${\bf x_1, \ldots, x_n}$, I would like to compute $\mathbb{P}({\bf \mu | x_1, \ldots, x_n})$. I know that the answer is $\mathbb{P}({\bf \mu | x_1, \ldots, x_n}) = N({\bf \mu_n}, {\bf \Sigma_n})$, where

$$
\begin{array}{rcl} \bf \mu_n &=& \displaystyle\Sigma_0 \left(\Sigma_0 + \frac{1}{n}\Sigma\right)^{-1}\left(\frac{1}{n}\sum_{i=1}^{n}{\bf x_i}\right) + \frac{1}{n}\Sigma\left(\Sigma_0+\frac{1}{n}\Sigma\right)^{-1}\mu_0 \\
\bf \Sigma_n & =&\displaystyle \Sigma_0\left(\Sigma_0 + \frac{1}{n}\Sigma\right)^{-1}\frac{1}{n}\Sigma
\end{array}$$

I am looking for the derivation of this result with all the intermediate matrix algebra.

Any help is much appreciated.

Best Answer

We have the following distributions for our random vectors:

$\mathbf x_i \mid \mu \sim N(\mu , \mathbf \Sigma)$

$\mu \sim N(\mu_0, \mathbf \Sigma_0)$

By Bayes' rule, the posterior distribution satisfies:

$p(\mu| \{\mathbf x_i\}) \propto p(\mu) \prod_{i=1}^N p(\mathbf x_i | \mu)$

So:

$\ln p(\mu| \{\mathbf x_i\}) = -\frac{1}{2}\sum_{i=1}^N(\mathbf x_i - \mu)'\mathbf \Sigma^{-1}(\mathbf x_i - \mu) -\frac{1}{2}(\mu - \mu_0)'\mathbf \Sigma_0^{-1}(\mu - \mu_0) + const$

$ = -\frac{1}{2} N \mu' \mathbf \Sigma^{-1} \mu + \sum_{i=1}^N \mu' \mathbf \Sigma^{-1} \mathbf x_i -\frac{1}{2} \mu' \mathbf \Sigma_0^{-1} \mu + \mu' \mathbf \Sigma_0^{-1} \mu_0 + const$

$ = -\frac{1}{2} \mu' (N \mathbf \Sigma^{-1} + \mathbf \Sigma_0^{-1}) \mu + \mu' (\mathbf \Sigma_0^{-1} \mu_0 + \mathbf \Sigma^{-1} \sum_{i=1}^N \mathbf x_i) + const$

Completing the square in $\mu$ (using the identity $-\frac{1}{2}\mu' \mathbf A \mu + \mu' \mathbf b = -\frac{1}{2}(\mu - \mathbf A^{-1}\mathbf b)' \mathbf A (\mu - \mathbf A^{-1}\mathbf b) + const$ for symmetric positive definite $\mathbf A$), this becomes

$= -\frac{1}{2}(\mu - (N \mathbf \Sigma^{-1} + \mathbf \Sigma_0^{-1})^{-1}(\mathbf \Sigma_0^{-1} \mu_0 + \mathbf \Sigma^{-1} \sum_{i=1}^N \mathbf x_i))' (N \mathbf \Sigma^{-1} + \mathbf \Sigma_0^{-1}) (\mu - (N \mathbf \Sigma^{-1} + \mathbf \Sigma_0^{-1})^{-1}(\mathbf \Sigma_0^{-1} \mu_0 + \mathbf \Sigma^{-1} \sum_{i=1}^N \mathbf x_i)) + const$

Which is the log density of a Gaussian:

$\mu| \{\mathbf x_i\} \sim N((N \mathbf \Sigma^{-1} + \mathbf \Sigma_0^{-1})^{-1}(\mathbf \Sigma_0^{-1} \mu_0 + \mathbf \Sigma^{-1} \sum_{i=1}^N \mathbf x_i), (N \mathbf \Sigma^{-1} + \mathbf \Sigma_0^{-1})^{-1})$
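As a sanity check (not part of the original answer), here is a minimal NumPy sketch that computes the posterior in this precision form on simulated data; all the variable names (`Sigma`, `Sigma0`, `mu0`, `X`) are illustrative.

```python
import numpy as np

# Sketch: posterior of mu in the precision form N(mu_n, Sigma_n) with
#   Sigma_n = (N Sigma^{-1} + Sigma0^{-1})^{-1}
#   mu_n    = Sigma_n (Sigma0^{-1} mu0 + Sigma^{-1} sum_i x_i)
rng = np.random.default_rng(0)
d, N = 3, 50

A = rng.normal(size=(d, d)); Sigma  = A @ A.T + np.eye(d)   # known data covariance (SPD)
B = rng.normal(size=(d, d)); Sigma0 = B @ B.T + np.eye(d)   # prior covariance (SPD)
mu0 = rng.normal(size=d)                                     # prior mean
X = rng.multivariate_normal(mu0, Sigma, size=N)              # observations x_1 .. x_N

Sigma_n = np.linalg.inv(N * np.linalg.inv(Sigma) + np.linalg.inv(Sigma0))
mu_n = Sigma_n @ (np.linalg.solve(Sigma0, mu0) + np.linalg.solve(Sigma, X.sum(axis=0)))
print(mu_n)
print(Sigma_n)
```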

Using the Woodbury identity on our expression for the covariance matrix (equivalently, factor the posterior precision as $N \mathbf \Sigma^{-1} + \mathbf \Sigma_0^{-1} = N\, \mathbf \Sigma_0^{-1} \left(\frac{1}{N} \mathbf \Sigma + \mathbf \Sigma_0\right) \mathbf \Sigma^{-1}$ and invert each factor):

$(N \mathbf \Sigma^{-1} + \mathbf \Sigma_0^{-1})^{-1} = \mathbf \Sigma(\frac{1}{N} \mathbf \Sigma + \mathbf \Sigma_0)^{-1} \frac{1}{N} \mathbf \Sigma_0$
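A quick numerical spot-check of this matrix identity on random symmetric positive definite matrices (a sanity check only, not a proof; the names are illustrative):

```python
import numpy as np

# Check: (N Sigma^{-1} + Sigma0^{-1})^{-1} == Sigma (Sigma/N + Sigma0)^{-1} (Sigma0/N)
rng = np.random.default_rng(1)
d, N = 4, 7
A = rng.normal(size=(d, d)); Sigma  = A @ A.T + d * np.eye(d)
B = rng.normal(size=(d, d)); Sigma0 = B @ B.T + d * np.eye(d)

lhs = np.linalg.inv(N * np.linalg.inv(Sigma) + np.linalg.inv(Sigma0))
rhs = Sigma @ np.linalg.inv(Sigma / N + Sigma0) @ (Sigma0 / N)
print(np.allclose(lhs, rhs))  # True
```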

This gives the covariance matrix in the form the OP wanted. Using this expression (and its symmetry) in the expression for the mean, we have:

$\mathbf \Sigma(\frac{1}{N} \mathbf \Sigma + \mathbf \Sigma_0)^{-1} \frac{1}{N} \mathbf \Sigma_0 \mathbf \Sigma_0^{-1} \mu_0 + \frac{1}{N} \mathbf \Sigma_0(\frac{1}{N} \mathbf \Sigma + \mathbf \Sigma_0)^{-1} \mathbf \Sigma \mathbf \Sigma^{-1} \sum_{i=1}^N \mathbf x_i$

$= \mathbf \Sigma(\frac{1}{N} \mathbf \Sigma + \mathbf \Sigma_0)^{-1} \frac{1}{N} \mu_0 + \mathbf \Sigma_0(\frac{1}{N} \mathbf \Sigma + \mathbf \Sigma_0)^{-1} \sum_{i=1}^N (\frac{1}{N} \mathbf x_i)$

This is the form required by the OP for the mean (here $N$ plays the role of the OP's $n$).
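To confirm that the algebra above did not drop a factor, here is a short NumPy check (again illustrative, not part of the original answer) that the precision-form mean equals the OP's expression:

```python
import numpy as np

# Compare mu_n computed two ways:
#   precision form: (N Sigma^{-1} + Sigma0^{-1})^{-1} (Sigma0^{-1} mu0 + Sigma^{-1} sum_i x_i)
#   OP's form:      Sigma0 (Sigma0 + Sigma/N)^{-1} xbar + (Sigma/N)(Sigma0 + Sigma/N)^{-1} mu0
rng = np.random.default_rng(2)
d, N = 3, 25
A = rng.normal(size=(d, d)); Sigma  = A @ A.T + np.eye(d)
B = rng.normal(size=(d, d)); Sigma0 = B @ B.T + np.eye(d)
mu0 = rng.normal(size=d)
X = rng.multivariate_normal(mu0, Sigma, size=N)
xbar = X.mean(axis=0)

prec_form = np.linalg.solve(N * np.linalg.inv(Sigma) + np.linalg.inv(Sigma0),
                            np.linalg.solve(Sigma0, mu0) + np.linalg.solve(Sigma, X.sum(axis=0)))
op_form = Sigma0 @ np.linalg.inv(Sigma0 + Sigma / N) @ xbar \
          + (Sigma / N) @ np.linalg.inv(Sigma0 + Sigma / N) @ mu0
print(np.allclose(prec_form, op_form))  # True
```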
