Solved – Is Markov chain based sampling the “best” for Monte Carlo sampling? Are there alternative schemes available?

markov-chain, markov-chain-montecarlo, monte-carlo, sampling, stochastic-approximation

Markov Chain Monte Carlo is a method based on Markov chains that allows us to obtain samples (in a Monte Carlo setting) from non-standard distributions from which we cannot draw samples directly.

My question is why Markov chains are considered "state-of-the-art" for Monte Carlo sampling. An alternative phrasing of the question might be: are there any other schemes, comparable to Markov chains, that can be used for Monte Carlo sampling? I know (at least from looking at the literature) that MCMC has deep theoretical roots (in terms of conditions like (a)periodicity, homogeneity, and detailed balance), but I am wondering whether there are any "comparable" probabilistic models/methods for Monte Carlo sampling.

Please guide me if I have confused some part of the question (or if it seems confusing altogether).

Best Answer

There is no reason to claim that MCMC sampling is the "best" Monte Carlo method! Usually, it is on the contrary worse than iid sampling, at least in terms of the variance of the resulting Monte Carlo estimators $$\frac{1}{T}\sum_{t=1}^T h(X_t)\,.$$ Indeed, while this average converges to the expectation $\mathbb{E}_{\pi}[h(X)]$ when $\pi$ is the stationary and limiting distribution of the Markov chain $(X_t)_t$, there are at least two drawbacks in using MCMC methods:

  1. The chain needs to "reach stationarity", meaning that it needs to forget about its starting value $X_0$. In other words, $t$ must be "large enough" for $X_t$ to be distributed from $\pi$. Sometimes "large enough" may exceed by several orders of magnitude the computing budget for the experiment.
  2. The values $X_t$ are correlated, leading to an asymptotic variance that involves $$\text{var}_\pi(h(X))+2\sum_{t=1}^\infty\text{cov}_\pi(h(X_0),h(X_t))$$ which generally exceeds $\text{var}_\pi(h(X))$ and hence calls for longer simulations than an iid sample would.
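Both drawbacks can be seen in a few lines of code. Below is a minimal sketch (not any specific implementation from the literature): a random-walk Metropolis sampler targeting a standard normal, whose draws show a large lag-1 autocorrelation, whereas an iid sample of the same size shows essentially none. The step size and chain length are illustrative choices.

```python
import math
import random

random.seed(0)

def metropolis_hastings(n, step=1.0):
    """Random-walk Metropolis sketch for a standard normal target:
    propose x' = x + Uniform(-step, step), accept with prob min(1, pi(x')/pi(x))."""
    log_pi = lambda z: -0.5 * z * z  # log density of N(0, 1), up to a constant
    x, samples = 0.0, []
    for _ in range(n):
        proposal = x + random.uniform(-step, step)
        if math.log(random.random()) < log_pi(proposal) - log_pi(x):
            x = proposal          # accept the move
        samples.append(x)         # on rejection the chain repeats its current state
    return samples

def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    m = sum(xs) / len(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((a - m) ** 2 for a in xs)
    return num / den

chain = metropolis_hastings(20_000)
iid = [random.gauss(0.0, 1.0) for _ in range(20_000)]

# The chain's draws are strongly positively correlated; the iid draws are not,
# which is exactly why the MCMC estimator has a larger asymptotic variance.
print(lag1_autocorr(chain), lag1_autocorr(iid))
```

The positive covariances in the chain feed directly into the asymptotic variance formula above, which is why one often thins or lengthens MCMC runs relative to an iid design.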

This being said, MCMC is very useful for handling settings where regular iid sampling is impossible or too costly and where importance sampling is quite difficult to calibrate, in particular because of the dimension of the random variable to be simulated.
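For contrast, here is what the importance-sampling alternative mentioned above looks like when it *is* easy to calibrate: a self-normalised estimator of $\mathbb{E}_\pi[h(X)]$ for $\pi=\mathcal{N}(0,1)$, using a deliberately wider Gaussian proposal. All densities, the choice $h(x)=x^2$, and the sample size are illustrative assumptions; in high dimension, finding a proposal whose weights behave this well is the hard part.

```python
import math
import random

random.seed(1)

def log_pi(x):   # target log density N(0, 1), up to a constant
    return -0.5 * x * x

def log_q(x):    # proposal log density N(0, 2^2), up to a constant
    return -0.5 * (x / 2.0) ** 2

h = lambda x: x * x   # E_pi[h(X)] = 1 for a standard normal

# Self-normalised importance sampling: draw from q, reweight by pi/q.
n = 50_000
xs = [random.gauss(0.0, 2.0) for _ in range(n)]
ws = [math.exp(log_pi(x) - log_q(x)) for x in xs]
estimate = sum(w * h(x) for w, x in zip(ws, xs)) / sum(ws)
print(estimate)  # should be close to 1
```

Because the proposal dominates the target here, the weights $\pi/q$ are bounded and the estimator is well behaved; when the proposal is too narrow or the dimension grows, a few huge weights dominate the sum and the variance explodes, which is the calibration difficulty referred to above.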

However, sequential Monte Carlo methods like particle filters may be more appropriate in dynamical models, where the data comes in bursts that need immediate attention and may even vanish (i.e., cannot be stored) after a short while.

In conclusion, MCMC is a very useful (and very much used) tool to handle complex settings where regular Monte Carlo solutions fail.
