[Math] Almost sure convergence in strong law of large numbers.

law-of-large-numbers, probability-theory

The Strong Law of Large Numbers is often stated as
$$\overline{X}_n\ \xrightarrow{a.s.}\ \mu \qquad\text{as}\ n \to \infty$$
or
$$\Pr\!\left( \lim_{n\to\infty}\overline{X}_n = \mu \right) = 1$$
where $\overline{X}_n$ is the average of $n$ i.i.d. random variables with mean $\mu$.
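As a quick numerical illustration of this statement (a sketch, not part of the original question: NumPy, the Exponential(1) distribution, the seed, and the number of paths are all arbitrary choices), one can watch the running averages along a few individual sample paths settle near $\mu$:

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 1.0
n = 100_000

# Five independent sample paths; each row is one realisation of
# X_1, X_2, ..., X_n with X_i ~ Exponential(1), so E[X_i] = mu = 1.
samples = rng.exponential(scale=mu, size=(5, n))

# Running averages \bar{X}_1, ..., \bar{X}_n along each path.
running_avg = np.cumsum(samples, axis=1) / np.arange(1, n + 1)

# Almost sure convergence says each individual path ends up near mu.
print(running_avg[:, -1])  # all entries close to 1.0
```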

It seems to me from the definition that, in order to have a notion of "almost sure convergence", we must have a sequence of random variables $X_i$ on the same probability space; at the same time, $\overline{X}_n$ is a random variable on the product of the probability spaces of the first $n$ of the $X_i$'s. Of course we can think of all the $\overline{X}_k$'s for $k \leq n$ as living on the $n$-th product, but to consider all $n$ at the same time, we would need an infinite product. This would make sense, but seems somewhat complicated (infinite product spaces!). So, am I missing something, or is this what's going on, and do all elementary introductions (and Wikipedia) just suppress this point?

Edit: Ok, so the original question was a bit misleading, and the clarification is a bit long (see the comments below), but I decided to post it as an answer as well (it overlaps somewhat with the accepted one).

Best Answer

Independence concerns random variables defined on a common probability space. To see this, assume that $X:(\Omega,\mathcal F)\to(E,\mathcal E)$ and $Y:(\Psi,\mathcal G)\to(E,\mathcal E)$ are random variables. To show that $X$ and $Y$ are independent, one would consider events such as $$ [X\in B]\cap[Y\in C]=\{\omega\in\Omega\mid X(\omega)\in B\}\cap\{\psi\in\Psi\mid Y(\psi)\in C\}. $$ Unless $(\Omega,\mathcal F)=(\Psi,\mathcal G)$, this simply does not make sense.
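To make this concrete, here is a minimal finite sketch (illustrative only; the two-dice space and the sets $B$, $C$ are not from the answer) in which $X$ and $Y$ live on one common $\Omega$, so the event $[X\in B]\cap[Y\in C]$ and the product rule for independence both make sense:

```python
from itertools import product
from fractions import Fraction

# Omega = {1,...,6}^2 with the uniform measure: two fair dice on one space.
Omega = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(Omega))  # probability of each single omega

def X(omega):
    return omega[0]  # first die

def Y(omega):
    return omega[1]  # second die

B, C = {1, 2}, {6}

# [X in B] ∩ [Y in C] as a subset of the common Omega.
both = sum(1 for omega in Omega if X(omega) in B and Y(omega) in C) * p
px = sum(1 for omega in Omega if X(omega) in B) * p
py = sum(1 for omega in Omega if Y(omega) in C) * p
print(both, px * py, both == px * py)  # 1/18 1/18 True
```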

...$\overline{X}_n$ is a random variable on the product of probability spaces of the first $n$ of the $X_i$'s...

Not at all. The random variable $\overline{X}_n$ can only be defined on the common probability space on which every $X_n$ is defined. To define sums such as $X+Y$, which every $\overline{X}_n$ requires, one sets $$X+Y:\omega\mapsto X(\omega)+Y(\omega). $$
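A toy sketch of this pointwise definition (the coin-flip space and all names are illustrative assumptions, not from the answer): random variables are just functions on one common finite $\Omega$, and $X+Y$ is formed by evaluating both at the same $\omega$:

```python
from itertools import product

# Omega = {0,1}^2 with the uniform measure: two fair coin flips.
Omega = list(product([0, 1], repeat=2))

def X(omega):
    return omega[0]  # outcome of the first flip

def Y(omega):
    return omega[1]  # outcome of the second flip

def add(f, g):
    # (f + g)(omega) = f(omega) + g(omega); this is only meaningful
    # because f and g accept the same omega, i.e. share one space.
    return lambda omega: f(omega) + g(omega)

S = add(X, Y)
print([S(omega) for omega in Omega])  # [0, 1, 1, 2]
```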

Maybe one needs infinite product spaces to even talk about a sequence of i.i.d. $X_i$'s

One does not, for the reasons above. If one insists on using a product space, the construction is as follows. Assume that $X_i:(\Omega_i,\mathcal F_i)\to(E,\mathcal E)$, consider $\Omega=\prod\limits_i\Omega_i$, $\mathcal F=\mathop{\otimes}_i\mathcal F_i$ and, for every $i$, the random variable $Z_i:(\Omega,\mathcal F)\to(E,\mathcal E)$ defined by $Z_i(\omega)=X_i(\omega_i)$ for every $\omega=(\omega_i)_i$ in $\Omega$. Then, if each $(\Omega_i,\mathcal F_i)$ is endowed with a probability $P_i$ such that the distribution $P_i\circ X_i^{-1}$ does not depend on $i$, and if $(\Omega,\mathcal F)$ is endowed with the probability $P=\mathop{\otimes}_iP_i$, then indeed $(Z_i)$ is i.i.d. with common distribution $$ P\circ Z_i^{-1}=P_i\circ X_i^{-1}. $$

One may find this kind of construction fascinating. Usually though, after a while, the feeling passes... :-) and one sticks to the modus operandi most probabilists adopt, which is to consider that the exact nature of $(\Omega,\mathcal F,P)$ is irrelevant and that all that counts are the image measures on the target space.
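For a finite analogue of this construction (a sketch under simplifying assumptions: three coordinates instead of infinitely many, and each $X_i$ taken to be the identity on $\Omega_i=\{0,1\}$ with the uniform $P_i$), one can lift each $X_i$ to $Z_i$ on the product space and check the common distribution:

```python
from itertools import product
from fractions import Fraction

# Three coordinate spaces Omega_i = {0, 1}, each with its own X_i; here
# every X_i is the identity, so P_i ∘ X_i^{-1} is the same for all i.
omegas = [[0, 1], [0, 1], [0, 1]]
X = [lambda w: w] * 3

Omega = list(product(*omegas))                            # the product space
P = {omega: Fraction(1, len(Omega)) for omega in Omega}   # product measure

def Z(i):
    # Z_i(omega) = X_i(omega_i): X_i lifted to the product space.
    return lambda omega: X[i](omega[i])

# Each Z_i has the common distribution P(Z_i = 1) = 1/2.
for i in range(3):
    print(sum(p for omega, p in P.items() if Z(i)(omega) == 1))
```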
