[Math] How does the pointwise ergodic theorem generalize the strong law of large numbers?

ergodic-theory, probability-theory

I've heard (e.g. here) that the pointwise ergodic theorem (PWET) generalizes the strong law of large numbers (SLLN). How exactly does the PWET generalize the SLLN? The PWET requires a measure-preserving transformation (and an ergodic one for a stronger conclusion). What is the measure-preserving (and maybe ergodic) transformation in the SLLN?

I've also heard it said (though I don't recall where) that the PWET generalizes the SLLN essentially because the PWET only requires "independence in the limit", whereas the SLLN requires the sequence to be i.i.d. What is meant by "independence in the limit"?

Best Answer

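Let $\mu$ be a probability distribution on $\mathbb R$. Consider the product space $\Omega := \mathbb R \times \mathbb R \times \cdots$ with the product measure $P = \mu \times \mu \times \cdots$. Let $f : \Omega \to \mathbb R$ be the "first component" map, $f(x_1,x_2,x_3,\dots) = x_1$, and let $T : \Omega \to \Omega$ be the "left shift" map, $T(x_1,x_2,x_3,\dots) = (x_2,x_3,\dots)$.

Then (1) $T$ is a measure-preserving transformation, and (2) on the sample space $\Omega$ with respect to the measure $P$, the sequence $X_n(\omega) = f(T^n(\omega))$ is an i.i.d. sequence of random variables with distribution $\mu$.

The strong law of large numbers and the individual (pointwise) ergodic theorem both tell us about almost sure convergence of the averages $$ \frac{1}{n} \sum_{j=1}^n X_j $$ as $n \to \infty$, subject to an integrability condition: the SLLN asks that $\mu$ have a finite mean, and the ergodic theorem asks that $f \in L^1(P)$, which is the same requirement here. The shift $T$ is moreover ergodic (an invariant set is a tail event, so Kolmogorov's zero-one law gives it probability $0$ or $1$), so the limit supplied by the ergodic theorem is the constant $\int f \, dP = \int x \, d\mu(x)$, which is exactly the conclusion of the SLLN.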
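As a rough numerical illustration of the construction above, here is a minimal Python sketch. It makes several assumptions that are not part of the answer: a point $\omega$ is truncated to a finite tuple, $\mu$ is taken to be a normal distribution with mean 2 and standard deviation 1, and the names `shift`, `first_component`, `birkhoff_average`, and `sample_omega` are hypothetical. It only shows that averaging $f$ along the orbit of $T$ is the same computation as averaging the coordinates of $\omega$.

```python
import random
from statistics import fmean


def shift(omega):
    """Left shift T: drop the first coordinate of the sequence."""
    return omega[1:]


def first_component(omega):
    """Observable f: return the first coordinate of the sequence."""
    return omega[0]


def birkhoff_average(f, T, omega, n):
    """Return (1/n) * sum_{j=1}^{n} f(T^j(omega)), matching X_j = f(T^j(omega))."""
    total = 0.0
    point = omega
    for _ in range(n):
        point = T(point)   # move one step along the orbit
        total += f(point)  # evaluate the observable there
    return total / n


def sample_omega(length, seed=0):
    """Assumption: draw a finite prefix of omega with i.i.d. Normal(2, 1) coordinates."""
    rng = random.Random(seed)
    return tuple(rng.gauss(2.0, 1.0) for _ in range(length))


n = 2000
omega = sample_omega(n + 1)  # n shifts need n + 1 coordinates
orbit_average = birkhoff_average(first_component, shift, omega, n)
coordinate_average = fmean(omega[1:n + 1])  # plain sample mean of x_2, ..., x_{n+1}
print(orbit_average, coordinate_average)
```

The two printed numbers should agree up to floating-point rounding, and for large `n` both should sit near the assumed mean of 2. A finite simulation cannot verify an almost sure limit, so this is only a sanity check of the identification $X_j = f(T^j(\omega))$, not evidence for either theorem.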