Markov chains: If $P$ is irreducible and $\pi$ any stationary distribution, then $\pi_i = \frac{1}{m_i}$.


I am trying to understand a proof of the following claim:

If $P$ is irreducible and $\pi$ any stationary distribution, then $\pi_i = \frac{1}{m_i}$, where $m_i$ is the mean return time to state $i$.
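Before the proof, a quick sanity check of the claim on a concrete example may help. For a hypothetical irreducible two-state chain (not part of the original question), both $\pi$ and the mean return times have closed forms, so the identity $\pi_i = 1/m_i$ can be verified directly:

```python
# Irreducible two-state chain: from state 0 go to 1 with prob a,
# from state 1 go to 0 with prob b, where 0 < a, b < 1.

def stationary(a, b):
    """Unique stationary distribution: solves pi P = pi with pi_0 + pi_1 = 1."""
    return (b / (a + b), a / (a + b))

def mean_return_times(a, b):
    """First-step analysis: from state 1 the time to hit 0 is geometric
    with mean 1/b, so m_0 = 1 + a*(1/b); symmetrically m_1 = 1 + b*(1/a)."""
    return (1 + a / b, 1 + b / a)

a, b = 0.3, 0.7          # arbitrary illustrative values
pi = stationary(a, b)
m = mean_return_times(a, b)
for i in range(2):
    assert abs(pi[i] - 1 / m[i]) < 1e-12   # pi_i == 1/m_i
print(pi, m)
```

Here $\pi_0 = b/(a+b)$ and $m_0 = 1 + a/b = (a+b)/b$, so $\pi_0 m_0 = 1$ exactly, and symmetrically for state $1$.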


Here's the proof:

Suppose $\pi$ is stationary for $P$, and let $X$ be a Markov chain with initial distribution $\pi$ and transition matrix $P$. Let $V_i(n)$ be the number of visits to state $i$ before time $n$. Then, by stationarity, $\mathbb{P}(X_n=i) = \pi_i$ for all $n$, and
$$
\frac{\mathbb{E}[V_i(n)]}{n} = \frac{1}{n} \sum_{r=0}^{n-1}\mathbb{E}[1\{X_r=i\}]
= \frac{1}{n} \sum_{r=0}^{n-1}\mathbb{P}(X_r=i)
= \pi_i
\tag{$1$} \label{ref1}
$$

From the ergodic theorem, for any $\epsilon > 0$,
$$
\mathbb{P}\left(\left|\frac{V_i(n)}{n} - \frac{1}{m_i} \right| > \epsilon\right) < \epsilon \tag{$2$} \label{ref2}
$$

for large enough $n$ (since almost sure convergence implies convergence in probability).

But since $\frac{V_i(n)}{n}$ is bounded between $0$ and $1$, it follows from \eqref{ref2} that
$$
\boxed{\frac{\mathbb{E}[V_i(n)]}{n} \to \frac{1}{m_i} \quad \text{as} \quad n \to \infty}
$$

Comparing to \eqref{ref1}, we obtain $\pi_i = \frac{1}{m_i}$.


Why does the boxed statement follow from \eqref{ref2}?

Best Answer

You can read \eqref{ref2} as

$$\mathbb{P}\left(-\epsilon \le \frac{V_i(n)}{n}-\frac{1}{m_i} \le \epsilon\right ) > 1- \epsilon$$

i.e. with probability greater than $1-\epsilon$, the difference $\frac{V_i(n)}{n}-\frac{1}{m_i}$ lies in $[-\epsilon, \epsilon]$. On the complementary event, of probability at most $\epsilon$, the difference still lies in $[-1, 1]$, since both $\frac{V_i(n)}{n}$ and $\frac{1}{m_i}$ are between $0$ and $1$. Hence the expectation $\mathbb{E}\left[\frac{V_i(n)}{n}-\frac{1}{m_i}\right]=\frac{\mathbb{E}[V_i(n)]}{n}-\frac{1}{m_i}$ is bounded below by $(1-\epsilon)\cdot(-\epsilon)+\epsilon\cdot(-1)>-2\epsilon$ and bounded above by $(1-\epsilon)\cdot\epsilon+\epsilon\cdot 1<2\epsilon$, i.e.
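The two expectation bounds can be checked numerically; a minimal sketch (the range of $\epsilon$ values is an arbitrary choice for illustration):

```python
# Verify (1-eps)*(-eps) + eps*(-1) > -2*eps  and
#        (1-eps)*eps   + eps*1    <  2*eps   for 0 < eps < 1.
for k in range(1, 100):
    eps = k / 100
    lower = (1 - eps) * (-eps) + eps * (-1)   # = -(2*eps - eps**2)
    upper = (1 - eps) * eps + eps * 1         # =   2*eps - eps**2
    assert -2 * eps < lower < 0
    assert 0 < upper < 2 * eps
```

Algebraically both bounds equal $\pm(2\epsilon - \epsilon^2)$, which lies strictly inside $(-2\epsilon, 2\epsilon)$ for $0 < \epsilon < 1$.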

$$-2\epsilon < \frac{\mathbb{E}[V_i(n)]}{n}-\frac{1}{m_i} < 2\epsilon$$ for all large enough $n$ (how large depends on $\epsilon$). Since $\epsilon > 0$ was arbitrary, this gives $$\frac{\mathbb{E}[V_i(n)]}{n} \to \frac{1}{m_i} \quad \text{as} \quad n \to \infty.$$
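To see this convergence concretely, here is a small Monte Carlo sketch on a hypothetical two-state chain (the values of $a$, $b$, the horizon $n$, and the tolerance are arbitrary illustrative choices): the empirical visit frequency $V_0(n)/n$ settles near $1/m_0$, which equals $\pi_0$.

```python
import random

random.seed(0)

# Hypothetical irreducible two-state chain:
# from state 0 move to 1 with prob a; from state 1 move to 0 with prob b.
a, b = 0.2, 0.5
pi0 = b / (a + b)        # stationary probability of state 0
m0 = 1 + a / b           # mean return time to 0 (first-step analysis); 1/m0 == pi0

n = 100_000
x = 0 if random.random() < pi0 else 1   # start from the stationary distribution
visits = 0                              # V_0(n): visits to state 0 before time n
for _ in range(n):
    visits += (x == 0)
    if x == 0:
        x = 1 if random.random() < a else 0
    else:
        x = 0 if random.random() < b else 1

freq = visits / n
print(freq, 1 / m0)      # the two numbers should be close
assert abs(freq - 1 / m0) < 0.01
```

This only illustrates $V_0(n)/n \to 1/m_0$ for one sample path; the answer above is precisely about upgrading that (in-probability) convergence to convergence of the expectation $\mathbb{E}[V_0(n)]/n$.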