Intuition for the Borel-Cantelli lemma and almost sure convergence

borel-cantelli-lemmas measure-theory probability-theory

I have looked around the site to see if a similar question has been asked. I couldn't find one, but please point me to it if I'm mistaken.

My question is about the intuition behind the Borel-Cantelli lemma and almost sure (a.s.) convergence.

For example, take the stochastic process $X_n=1_{[0,\frac{1}{n^2}]}$ for $n\in \mathbb{N}$. The book that I'm reading uses this definition of almost sure convergence: $P(X_n \rightarrow X)=1$, where $(X_n \rightarrow X)=\{\forall \epsilon>0 \ \exists N\in \mathbb{N} \ \forall n\geq N: |X_n-X|<\epsilon\}$. In the example, for every $N$ we pick we have $P(|X_N-0|< \epsilon)=1-\frac{1}{N^2}\neq 1$ (for $\epsilon<1$). Hence, according to the definition, $X_n$ does not converge a.s. to $0$ (I know that this isn't correct).

By the Borel-Cantelli lemma we have, for $\epsilon<1$, that $\sum_{n=1}^{\infty} P(|X_n-0|\geq \epsilon)=\sum_{n=1}^{\infty} \frac{1}{n^2}<\infty \Rightarrow P(|X_n-0|\geq \epsilon \, \text{i.o.})=0$. Hence $X_n$ converges a.s. to $0$, since $|X_n-0|\geq \epsilon$ happens only finitely often. Therefore, there exists an $N$ such that for $n\geq N$ we have $P(|X_n-0|\leq \epsilon)=1$.
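For concreteness, here is a minimal numerical sketch of the "only finitely often" claim, assuming (as the $1-\frac{1}{N^2}$ computation suggests) that the underlying space is $[0,1]$ with Lebesgue measure: for each fixed $\omega$ drawn uniformly, the event $|X_n(\omega)-0|\geq \epsilon$ occurs for only finitely many $n$, and the last such $n$ depends on $\omega$.

```python
import numpy as np

# Assumption (not stated in the book excerpt): sample space is [0, 1]
# with Lebesgue measure, so omega is drawn uniformly from [0, 1].
rng = np.random.default_rng(0)

def X(n, omega):
    # X_n(omega) = indicator of [0, 1/n^2] evaluated at omega
    return 1.0 if omega <= 1.0 / n**2 else 0.0

eps = 0.5  # any epsilon < 1 gives the same picture
for omega in rng.uniform(0.0, 1.0, size=5):
    # X_n(omega) >= eps  <=>  X_n(omega) = 1  <=>  n <= 1/sqrt(omega),
    # so for a fixed omega > 0 the event occurs only finitely often.
    hits = [n for n in range(1, 10_000) if X(n, omega) >= eps]
    print(f"omega = {omega:.4f}: |X_n - 0| >= {eps} last occurs at n = {max(hits)}")
```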

I can't quite understand the intuition in this example. Why can we say that $X_n$ converges a.s. to $0$ when we can't pick a specific $N$? Doesn't that go against the definition of a.s. convergence? If I'm not wrong, the Borel-Cantelli lemma says that there exists an $N$, but in this example, no matter what $N$ we pick, it won't satisfy the a.s. convergence definition. I just find it a bit mind-boggling. Most of all, I'm curious whether this can be explained in a nice fashion. If the example is trivial or wrong, I'm sorry.

Best Answer

So to write your definition more precisely, I think that the book is saying the following:

Let $X_n$ be a sequence of random variables, and $X$ be another random variable on the same probability space $(\Omega, \mathcal F, \mathbb P)$. Let

$A = \{\omega \in \Omega : \text{ for all }\epsilon > 0, \text{ there exists } N(\omega, \epsilon) \in \mathbb N \text{ such that } n \geq N \Rightarrow \left|X(\omega) - X_n(\omega)\right| < \epsilon\}$.

Then we say that $X_n$ converges almost surely to $X$ if $\mathbb P[A] = 1$.

This is equivalent to the following statement: $X_n$ is said to converge to $X$ almost surely if there exists a set $\mathcal N \in \mathcal F$ with $\mathbb P[\mathcal N] = 0$ such that for all $\omega \in \Omega \setminus \mathcal N$ and for all $\epsilon > 0$, there exists some $N(\omega, \epsilon)\in \mathbb N$ such that $n \geq N(\omega, \epsilon) \Rightarrow \left|X(\omega) - X_n(\omega)\right| < \epsilon$.

The problem with the statement in your example is that you've gotten the order of quantifiers mixed up. In the definition, $N$ depends on both $\omega$ and $\epsilon$. It is true that $\mathbb P(\{\omega \in \Omega : |X_N(\omega) - 0| < \epsilon\}) < 1$ for every fixed $N$ (and $\epsilon < 1$), but this $N$ does not depend on $\omega$, which it needs to.
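To make the $\omega$-dependence concrete in your example (assuming, as your $1-\frac{1}{N^2}$ computation suggests, that the underlying space is $[0,1]$ with Lebesgue measure): for a fixed $\omega \in (0,1]$ we have

$$X_n(\omega) = 1_{[0,\frac{1}{n^2}]}(\omega) = 0 \quad \text{for all } n > \frac{1}{\sqrt{\omega}},$$

so $N(\omega, \epsilon) = \lceil 1/\sqrt{\omega}\rceil + 1$ works for every $\epsilon > 0$, but it blows up as $\omega \to 0$. The only point where $X_n(\omega)$ fails to converge to $0$ is $\omega = 0$, a set of measure zero, so $X_n \to 0$ almost surely even though no single $N$ works for all $\omega$ at once.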

A quick bit of intuition: $X_n$ converges almost surely to $X$ if it converges pointwise everywhere except on a set of measure $0$. This convergence need not be uniform. Just as a sequence of real-valued functions $f_n:\mathbb R \rightarrow \mathbb R$ may converge at every $x \in \mathbb R$ while requiring a different $N$ for each $x$, if $X_n$ converges almost surely to $X$ the convergence can still fail to be uniform, i.e. different $\omega$'s may need a different $N$.
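To make that analogy concrete with a standard illustrative example (not from the book): take $f_n(x) = x^n$ on $[0,1)$. Then $f_n(x) \to 0$ for every $x \in [0,1)$, but for $0 < \epsilon < 1$ and $x \in (0,1)$ you need

$$n > \frac{\ln \epsilon}{\ln x}$$

to get $|f_n(x)| < \epsilon$, and this threshold blows up as $x \to 1^-$. The convergence is pointwise but not uniform: there is no single $N$ that works for every $x$. In the same way, almost sure convergence only asks for an $N(\omega, \epsilon)$ at (almost) every fixed $\omega$, not one $N$ that works for all $\omega$ simultaneously.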

Hope this helps.