Markov Chain Monte Carlo – Determining the Number of Markov Chain Monte Carlo Samples Needed

markov-chain-montecarlomarkov-processmonte carlosample-size

There is a lot of literature out there about Markov chain Monte Carlo (MCMC) convergence diagnostics, including the most popular Gelman-Rubin diagnostic. However, all of these assess the convergence of the Markov chain, and thus address the question of burn-in.

Once I have figured out burn-in, how should I decide how many MCMC samples are enough to continue with my estimation process? Most of the papers using MCMC mention that they ran the Markov chain for some $n$ iterations, but do not say anything about why/how they chose that number, $n$.

In addition, one desired sample size cannot be the answer for all samplers, since the correlation in the Markov chain varies greatly from problem to problem. So is there a rule out there to find out the number of samples required?

Best Answer

How many samples (post burn-in) that you need depends on what you are trying to do with those samples and how your chain mixes.

Typically we are interested in posterior expectations (or quantiles) and we approximate these expectations by averages of our posterior samples, i.e. $$ E[h(\theta)|y] \approx \frac{1}{M} \sum_{m=1}^M h(\theta^{(m)}) = E_M $$ where $\theta^{(m)}$ are samples from your MCMC. By the Law of Large Numbers the MCMC estimate converges almost surely to the desired expectation.

But to address the question of how many samples we need to be assured that we are close enough to the desired expectation, we need a Central Limit Theorem (CLT) result, i.e. something like $$ \frac{E_M -E[h(\theta)|y]}{\sqrt{M}} \stackrel{d}{\to} N(0,v_h^2) $$ With this CLT we could then make probabilitic statements such as "there is 95% probability that $E[h(\theta)|y]$ is between $E_M \pm 1.96 v_h$." The two issues here are

  1. Does the CLT apply?
  2. How can we estimate $v_h^2$.

We have some results on when the CLT applies, e.g. discete state chains, reversible chains, geometrically ergodic chains. See Robert and Casella (2nd ed) section 6.7.2 for some results in this direction. Unfortunately the vast majority of Markov chains that are out there have no proof that CLT exists.

If a CLT does exist, we still need to estimate the variance in the CLT. One way to estimate this variance involves breaking the chain up into blocks, see Gong and Flegal and references therein. The method has been implemented in the R package mcmcse with the functions mcse and mcse.q to estimate the variance for expectations and quantiles.

Related Question