Monte Carlo and Markov Chains

markov chains, probability theory

From a rigorous perspective, why does Monte Carlo work? In physics, you often set up a Markov chain with transition probabilities $p_{ab}$ ($a,b$ are possible states) such that it satisfies the "detailed balance condition", i.e., $\pi_a p_{ab} = \pi_b p_{ba}$, where $\pi_a$ is the desired measure/distribution. I understand that detailed balance guarantees that the desired measure $\pi$ is stationary with respect to the transition probabilities $p_{ab}$, but why exactly does $X_n \rightarrow \pi$ in distribution as $n \to \infty$?
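(To be explicit about the stationarity I mean: summing detailed balance over $a$ gives
$$\sum_a \pi_a p_{ab} = \sum_a \pi_b p_{ba} = \pi_b \sum_a p_{ba} = \pi_b,$$
so $\pi$ is indeed invariant under one step of the chain.)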

Basically, what I'm asking about is the asymptotic behavior of Markov chains and its relation to Monte Carlo. I would appreciate some references that treat this formulation rigorously.

EDIT:
I realized that in practice, we generate such Markov chains using an accept/reject step, as in the Metropolis algorithm (often loosely called rejection sampling).
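For readers who want to see that accept/reject construction concretely, here is a minimal sketch of a Metropolis-style update for a one-dimensional target; the function name `metropolis_chain`, the Gaussian random-walk proposal, and the example target are illustrative choices, not anything specified in the question:

```python
import numpy as np

def metropolis_chain(log_pi, x0, n_steps, step_size=1.0, rng=None):
    """Generate a Markov chain whose transition kernel satisfies detailed
    balance with respect to the (possibly unnormalized) density exp(log_pi).

    The proposal is a symmetric Gaussian random walk, so the Metropolis
    acceptance probability reduces to min(1, pi(y)/pi(x)).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = float(x0)
    samples = np.empty(n_steps)
    for n in range(n_steps):
        y = x + step_size * rng.standard_normal()      # symmetric proposal
        if np.log(rng.random()) < log_pi(y) - log_pi(x):
            x = y                                       # accept the move
        samples[n] = x                                  # on rejection, x repeats
    return samples

# Illustrative target: standard normal, log pi(x) = -x^2/2 up to a constant.
samples = metropolis_chain(lambda x: -0.5 * x**2, x0=5.0, n_steps=10_000)
print(samples.mean(), samples.var())  # should drift toward 0 and 1
```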

Best Answer

In a finite-state Markov chain with $\pi_ap_{ab} = \pi_bp_{ba}$ for every pair of states $a,b,$ it may be false that $X_n \Rightarrow \pi.$

Specifically, the requirement for $X_n \Rightarrow \pi$ is that (1) every recurrent state is aperiodic and (2) the Markov subchain of recurrent states is irreducible.

Your condition that $\pi_ap_{ab} = \pi_bp_{ba}$ only shows that $\pi$ is stationary and that every state charged by $\pi$ (i.e., with $\pi_a > 0$) is recurrent. That is, in order to have $X_n \Rightarrow \pi$ we would still need the Markov chain to be both aperiodic and irreducible. Intuitively, irreducibility guarantees that the stationary distribution is unique, while aperiodicity guarantees that the chain is "well-mixed" rather than cycling among subsets of states.
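For reference, the precise statement in the finite-state case is: if the chain is irreducible and aperiodic with stationary distribution $\pi$, then
$$\lim_{n\to\infty} \mathbb{P}(X_n = b \mid X_0 = a) = \pi_b \quad \text{for all states } a, b,$$
so $X_n \Rightarrow \pi$ regardless of the initial distribution.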

For a couple simple examples, note that the Markov chains on states $1,2$ given by

  1. $p_{12} = p_{21} = 1$ satisfies your condition for the unique stationary distribution $\pi_1 = \pi_2 = \frac{1}{2},$ but $X_n \not\Rightarrow \pi$ for any initial distribution $X_0 \neq \pi.$
  2. $p_{11} = p_{22} = 1$ satisfies your condition for $\pi_1 = t, \pi_2 = 1-t$ with any $0\leq t\leq 1,$ but again $X_n \not\Rightarrow \pi$ for any $X_0 \neq \pi.$

Here, the first Markov chain wasn't aperiodic and the second wasn't irreducible.
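If it helps to see this numerically, here is a small Python sketch (my own illustration, with the two states indexed 0 and 1) that iterates the distribution of $X_n$ for both chains:

```python
import numpy as np

# Example 1: p_{12} = p_{21} = 1 (periodic).  Example 2: p_{11} = p_{22} = 1 (reducible).
P_periodic  = np.array([[0.0, 1.0],
                        [1.0, 0.0]])
P_reducible = np.array([[1.0, 0.0],
                        [0.0, 1.0]])

dist0 = np.array([1.0, 0.0])  # X_0 concentrated on the first state

for name, P in [("periodic", P_periodic), ("reducible", P_reducible)]:
    dist = dist0.copy()
    for n in range(1, 6):
        dist = dist @ P       # distribution of X_n (row vector times P)
        print(f"{name:9s} n={n}: {dist}")

# The periodic chain oscillates between (1, 0) and (0, 1); the reducible chain
# stays at (1, 0) forever.  Neither converges to the stationary (1/2, 1/2).
```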


As a side note: in physics, I wouldn't be too surprised if the "aperiodic" condition is ignored, since aperiodic chains are likely ubiquitous in nature while periodic chains are more "exotic". However, reducible Markov chains aren't exotic at all, so they should definitely be considered when answering this question of convergence.
