Bayesian Updating – How to Update Bayesian Models with New Data

Tags: bayesian, conjugate-prior, normal-distribution

How do we go about calculating a posterior, given a prior $N(a, b)$, after observing $n$ data points? I assume that we have to calculate the sample mean and variance of the data points and perform some calculation that combines them with the prior, but I'm not quite sure what the combination formula looks like.

Best Answer

The basic idea of Bayesian updating is that, given some data $X$ and a prior over the parameter of interest $\theta$, where the relation between data and parameter is described by a likelihood function, you use Bayes' theorem to obtain the posterior

$$ p(\theta \mid X) \propto p(X \mid \theta) \, p(\theta) $$

This can be done sequentially: after seeing the first data point $x_1$, the prior over $\theta$ is updated to a posterior; you can then take the second data point $x_2$ and use the posterior obtained before as your new prior, update it once again, and so on.
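A quick way to see that sequential updating gives the same answer as processing all the data at once is with a simpler conjugate pair. The sketch below uses a Beta-Bernoulli model (my choice for illustration, not the model from the question), where updating reduces to adding up successes and failures:

```r
# Sequential vs. batch updating in a Beta-Bernoulli model.
# The Beta prior is conjugate for the Bernoulli success probability,
# so each update just adds counts to the Beta parameters.
set.seed(42)
x <- rbinom(20, size = 1, prob = 0.3)  # 20 coin flips

# Batch update: Beta(a0 + successes, b0 + failures)
a0 <- 1; b0 <- 1                       # flat Beta(1, 1) prior
a_batch <- a0 + sum(x)
b_batch <- b0 + sum(1 - x)

# Sequential update: fold in one observation at a time,
# using each posterior as the prior for the next point
a <- a0; b <- b0
for (xi in x) {
  a <- a + xi
  b <- b + (1 - xi)
}

c(a, b) == c(a_batch, b_batch)  # TRUE TRUE -- both routes agree
```

The same equivalence holds for the normal-normal model discussed below; conjugacy is what makes each intermediate posterior a valid prior of the same family.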

Let me give you an example. Imagine that you want to estimate the mean $\mu$ of a normal distribution whose variance $\sigma^2$ is known. In that case we can use the normal-normal model. We assume a normal prior for $\mu$ with hyperparameters $\mu_0,\sigma_0^2:$

\begin{align} X\mid\mu &\sim \mathrm{Normal}(\mu,\ \sigma^2) \\ \mu &\sim \mathrm{Normal}(\mu_0,\ \sigma_0^2) \end{align}

Since the normal distribution is a conjugate prior for the mean $\mu$ of a normal distribution, we have a closed-form solution for updating the prior after a single observation $x$:

\begin{align} E(\mu' \mid x) &= \frac{\sigma^2\mu_0 + \sigma^2_0 x}{\sigma^2 + \sigma^2_0} \\[7pt] \mathrm{Var}(\mu' \mid x) &= \frac{\sigma^2 \sigma^2_0}{\sigma^2 + \sigma^2_0} \end{align}
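These two formulas can be packaged as a small helper function. This is a sketch (the function name `update_normal` is my own, not from any library):

```r
# One-step conjugate update for the normal-normal model with known
# data variance sigma_sq and prior N(mu0, sigma0_sq).
update_normal <- function(mu0, sigma0_sq, x, sigma_sq) {
  post_mean <- (sigma_sq * mu0 + sigma0_sq * x) / (sigma_sq + sigma0_sq)
  post_var  <- (sigma_sq * sigma0_sq)           / (sigma_sq + sigma0_sq)
  c(mean = post_mean, var = post_var)
}

# Observing x = 2 with known sigma^2 = 1 under a N(0, 1) prior:
# the posterior mean lands halfway between prior mean and data point,
# and the variance shrinks from 1 to 0.5.
update_normal(mu0 = 0, sigma0_sq = 1, x = 2, sigma_sq = 1)
```

Note how the posterior mean is a precision-weighted average of the prior mean and the observation: the smaller $\sigma^2$ is relative to $\sigma_0^2$, the more the data point dominates.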

Unfortunately, such simple closed-form solutions are not available for more sophisticated problems, and you have to rely on optimization algorithms (for point estimates, using the maximum a posteriori approach) or on MCMC simulation.
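For low-dimensional problems there is also a third option: evaluating the unnormalized posterior on a grid. The sketch below applies grid approximation to the normal-mean problem (where conjugacy is not actually needed, but it lets you verify the numerical answer against the exact formula):

```r
# Grid approximation of a posterior: evaluate log prior + log likelihood
# over a grid of candidate parameter values, then normalize numerically.
# Applied here to the normal mean so the answer can be cross-checked.
set.seed(1)
x        <- rnorm(50, mean = 1.4, sd = 2.7)      # simulated data
grid     <- seq(-10, 10, length.out = 2001)      # candidate mu values
log_post <- sapply(grid, function(m)
  dnorm(m, 0, 100, log = TRUE) +                 # log prior: N(0, 100^2)
  sum(dnorm(x, m, 2.7, log = TRUE)))             # log likelihood
post     <- exp(log_post - max(log_post))        # subtract max for stability
post     <- post / sum(post)                     # normalize on the grid
grid[which.max(post)]                            # approximate MAP estimate
```

Working on the log scale and subtracting the maximum before exponentiating avoids numerical underflow; the same trick is standard in MCMC implementations.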

Below is a simulated-data example in R:

n <- 1000
set.seed(123)
x     <- rnorm(n, 1.4, 2.7)   # data: unknown mean 1.4, known sd 2.7
mu    <- numeric(n)           # running posterior means
sigma <- numeric(n)           # running posterior variances

# First update: vague prior with mu_0 = 0, sigma_0^2 = 10000
mu[1]    <- (10000*x[1] + (2.7^2)*0)/(10000+2.7^2)
sigma[1] <- (10000*2.7^2)/(10000+2.7^2)
# Subsequent updates: each posterior becomes the prior for the next point
for (i in 2:n) {
  mu[i]    <- ( sigma[i-1]*x[i] + (2.7^2)*mu[i-1] )/(sigma[i-1]+2.7^2)
  sigma[i] <- ( sigma[i-1]*2.7^2                  )/(sigma[i-1]+2.7^2)
}

If you plot the results, you'll see how the posterior mean approaches the true value of $\mu$ (marked by the red line) as new data accumulate.

[Figure: Updating the prior in subsequent steps of the normal-normal model]

To learn more, you can check those slides and the Conjugate Bayesian analysis of the Gaussian distribution paper by Kevin P. Murphy. See also Do Bayesian priors become irrelevant with large sample size? You can also check those notes and this blog entry for an accessible step-by-step introduction to Bayesian inference.
