Solved – Posterior autocorrelation in PyMC: how to interpret it

bayesian · probabilistic-programming · pymc

I started learning Bayesian inference by reading "Probabilistic Programming and Bayesian Methods for Hackers". I found something in the third chapter that is not really clear to me. Let's look at a small part of the original text.

A chain that is not [Isn't meandering exploring?] exploring the space well will exhibit very high autocorrelation. Visually, if the trace seems to meander like a river, and not settle down, the chain will have high autocorrelation.
This does not imply that a converged MCMC has low autocorrelation. Hence low autocorrelation is not necessary for convergence, but it is sufficient. PyMC has a built-in autocorrelation plotting function in the Matplot module.
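The book's visual intuition can be checked numerically without PyMC. Below is a minimal sketch (plain NumPy, no PyMC API) comparing the lag-1 autocorrelation of a "meandering" trace, simulated as an AR(1) random walk, against a well-mixing trace of independent draws; the chain parameters are illustrative choices, not taken from the book.

```python
import numpy as np

rng = np.random.default_rng(0)

def autocorr(x, lag=1):
    """Sample autocorrelation of a 1-D trace at the given lag."""
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

# A "meandering" chain: each draw stays close to the previous one.
meander = np.zeros(10_000)
for t in range(1, meander.size):
    meander[t] = 0.99 * meander[t - 1] + rng.normal()

# A well-mixing chain: independent draws from the target.
mixing = rng.normal(size=10_000)

print(autocorr(meander))  # close to 0.99 -- poor exploration
print(autocorr(mixing))   # close to 0    -- good exploration
```

Plotting either trace makes the book's river analogy visible: the first wanders slowly through the space, the second looks like noise around the target.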

If autocorrelation is low, that is a sufficient condition for good convergence. But if autocorrelation is high, what can we say about the convergence?

If autocorrelation is high, does it mean that the parameter space was explored badly? How can we improve the sampler so that it converges and explores the full parameter space at the same time?

Best Answer

  1. Autocorrelation dictates the amount of time you have to wait for convergence. If autocorrelation is high, you will have to use a longer burn-in, and you will have to draw more samples after the burn-in to get a good estimate of the posterior distribution.
  2. Low autocorrelation means good exploration. Exploration and convergence are essentially the same thing. If a sampler explores well, then it will converge fast, and vice versa. Part of the confusion here is that the book does not explain the concept of convergence very well. You can find a much better definition of convergence on Wikipedia.
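Point 1 above — high autocorrelation means you need more draws — is commonly quantified with the effective sample size, roughly N / (1 + 2·Σ ρₖ) where ρₖ are the positive autocorrelations. The sketch below uses a crude truncated estimator (my own simplification, not PyMC's diagnostic) on an AR(1) chain; for AR(1) with coefficient φ, theory gives ESS ≈ N(1 − φ)/(1 + φ), so a φ = 0.95 chain of 50,000 draws carries only on the order of a thousand effectively independent samples.

```python
import numpy as np

rng = np.random.default_rng(1)

def ess(x, max_lag=1000):
    """Crude effective sample size: N / (1 + 2 * sum of autocorrelations),
    truncating the sum once correlations become negligible."""
    x = x - x.mean()
    n = x.size
    var = np.dot(x, x) / n
    tau = 1.0
    for lag in range(1, max_lag):
        rho = np.dot(x[:-lag], x[lag:]) / (n * var)
        if rho < 0.05:  # stop once the correlation has died out
            break
        tau += 2.0 * rho
    return n / tau

phi = 0.95
chain = np.zeros(50_000)
for t in range(1, chain.size):
    chain[t] = phi * chain[t - 1] + rng.normal()

# Theory for AR(1): ESS ~ 50_000 * (1 - 0.95) / (1 + 0.95) ~ 1_300
print(round(ess(chain)))
```

This is why a highly autocorrelated sampler needs both a longer burn-in and many more post-burn-in draws to estimate the posterior to the same precision.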