Can MCMC iterations after burn-in be used for density estimation?

Tags: asymptotics, distributions, markov-chain-montecarlo

After burn-in, can we use the MCMC iterations directly for density estimation, for example by plotting a histogram or computing a kernel density estimate?
My concern is that the MCMC iterations are not necessarily independent; at best they are identically distributed.

What if we further apply thinning to the MCMC iterations? My concern is that the thinned iterations are at best uncorrelated, and still not independent.

The justification I learned for using an empirical distribution function as an estimate of the true distribution function is the Glivenko–Cantelli theorem, where the empirical distribution function is computed from an iid sample. I seem to remember seeing some justification (asymptotic results?) for using histograms or kernel density estimates as density estimates, but I can't recall the details.

Best Answer

You can - and people do - estimate densities from MCMC sampling.

One thing to keep in mind is that while histograms and KDEs are convenient, at least in simple cases (such as Gibbs sampling), much more efficient estimates of density may be available.

If we consider Gibbs sampling in particular, the conditional density you're sampling from can be used in place of the sample value itself in producing an averaged estimate of the density. The result tends to be quite smooth.
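As a rough illustration of the idea above, here is a minimal sketch for a toy target that is not from the original answer: a bivariate normal with unit variances and correlation `rho`, where the full conditional of x given y is N(rho*y, 1 - rho^2). The marginal density of x at each grid point is estimated by averaging that conditional density over the sampled y values, instead of smoothing the sampled x values. All function names here are hypothetical.

```python
import numpy as np


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x (broadcasts over arrays)."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def gibbs_bivariate_normal(n_iter, rho, burn_in=500, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Toy target chosen for illustration only; returns the post-burn-in draws.
    """
    rng = np.random.default_rng(seed)
    cond_sd = np.sqrt(1.0 - rho ** 2)
    x, y = 0.0, 0.0
    xs = np.empty(n_iter)
    ys = np.empty(n_iter)
    for t in range(burn_in + n_iter):
        x = rng.normal(rho * y, cond_sd)  # draw x | y
        y = rng.normal(rho * x, cond_sd)  # draw y | x
        if t >= burn_in:
            xs[t - burn_in] = x
            ys[t - burn_in] = y
    return xs, ys


def conditional_density_estimate(grid, ys, rho):
    """Average the conditional density p(x | y_t) over the sampled y_t."""
    cond = normal_pdf(grid[:, None], rho * ys[None, :], 1.0 - rho ** 2)
    return cond.mean(axis=1)


if __name__ == "__main__":
    rho = 0.5
    xs, ys = gibbs_bivariate_normal(5000, rho)
    grid = np.linspace(-3.0, 3.0, 61)
    estimate = conditional_density_estimate(grid, ys, rho)
    # For this toy target the true marginal of x is standard normal.
    truth = normal_pdf(grid, 0.0, 1.0)
    print("max abs error:", np.max(np.abs(estimate - truth)))
```

Because each term in the average is itself a smooth density, the estimate is smooth without any bandwidth choice. This only works when the full conditional density is available in closed form, as it is in this toy case.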

The approach is discussed in

Gelfand and Smith (1990), "Sampling-Based Approaches to Calculating Marginal Densities"
Journal of the American Statistical Association, Vol. 85, No. 410, pp. 398-409

(though Geyer cautions that if the dependence in the sampler is high enough, it doesn't always reduce the variance, and gives conditions under which it does).

This approach is also discussed, for example, in Robert, C. P. and Casella, G. (1999) Monte Carlo Statistical Methods.

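You don't need independence, because you're actually computing an average, and averages along an ergodic chain still converge to the corresponding expectation despite the dependence. If you want to compute a standard error of a density estimate (or a cdf), however, then you have to account for the dependence.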
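One common way to account for the dependence is the method of batch means, which is a swapped-in technique here and not something the answer itself prescribes. The sketch below assumes a Gaussian kernel density estimate at a single point `x0`, which is just an average of kernel terms over the draws, and uses an AR(1) series as a stand-in for autocorrelated MCMC output. The function names, the bandwidth `h`, and the number of batches are all illustrative assumptions.

```python
import numpy as np


def ar1_chain(n, phi, seed=0):
    """Stationary AR(1) series with N(0, 1) marginal; a stand-in for MCMC output."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, np.sqrt(1.0 - phi ** 2), size=n)
    x = np.empty(n)
    x[0] = rng.normal()
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x


def kernel_terms(x0, draws, h):
    """Gaussian kernel terms K_h(x0 - X_t); their mean is the KDE at x0."""
    return np.exp(-0.5 * ((x0 - draws) / h) ** 2) / (h * np.sqrt(2.0 * np.pi))


def naive_se(terms):
    """Standard error that wrongly treats the terms as independent."""
    return terms.std(ddof=1) / np.sqrt(terms.size)


def batch_means_se(terms, n_batches=40):
    """Batch-means standard error of the mean of a dependent sequence."""
    batch_len = terms.size // n_batches
    trimmed = terms[: batch_len * n_batches]  # drop any leftover draws
    means = trimmed.reshape(n_batches, batch_len).mean(axis=1)
    return means.std(ddof=1) / np.sqrt(n_batches)


if __name__ == "__main__":
    draws = ar1_chain(20000, phi=0.9)
    terms = kernel_terms(0.0, draws, h=0.3)
    print("density estimate at 0:", terms.mean())
    print("naive SE:             ", naive_se(terms))
    print("batch-means SE:       ", batch_means_se(terms))
```

With strongly autocorrelated draws, the naive standard error tends to understate the uncertainty, while the batch-means figure reflects the dependence as long as the batches are long relative to the autocorrelation time. The point estimate itself is the same average in both cases.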

The same notion applies to other expectations, of course, and so it can be used to improve estimates of many other kinds of average.
