Proof of the continuity axiom in the classical probability model

measure-theory · probability · probability-theory · sequences-and-series

I am reading a "proof" that in the classical probability model the probability axioms of Kolmogorov are satisfied. I say "proof" because there is a serious flaw in it. So I need some clarification, i.e., a real rigorous proof (for one of the axioms).


Definition: Let $(\Omega, F)$ be a measurable space, where $\Omega = \{w_1, w_2, w_3, \dots\}$ is a countably infinite set and $F$ is the power set of $\Omega$. Let us assume that every elementary event/outcome $w_i$ is mapped to a non-negative number $p(w_i)$, and also let us assume that

$\sum_{i=1}^\infty p(w_i) = 1$

(which, I think, means the series on the left-hand side is convergent and its sum is 1).

Then for every event $A \subseteq \Omega$ we define the probability of $A$ as

$P(A) = \sum_{w \in A} p(w)$
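To make this definition concrete, here is a minimal numeric sketch. The particular choice $p(w_i) = 2^{-i}$ (so that $\sum_{i \ge 1} p(w_i) = 1$) and the names `p`, `P` are my own, purely for illustration:

```python
def p(i):
    """Probability of the elementary outcome w_i; here p(w_i) = 2^{-i}."""
    return 2.0 ** (-i)

def P(indices):
    """P(A) = sum of p(w) over w in A, for an event A given by its set of indices."""
    return sum(p(i) for i in indices)

print(P({1, 2, 3}))  # p(w_1) + p(w_2) + p(w_3) = 0.5 + 0.25 + 0.125 = 0.875
```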


OK… now having this definition, we need to prove the following axiom is satisfied.


A4: For every sequence of events $A_1 \supseteq A_2 \supseteq A_3 \supseteq \dots$ such that $$\bigcap_{n=1}^\infty A_n = \emptyset,$$
the respective sequence of probabilities $P(A_1), P(A_2), P(A_3), \dots$ is decreasing and goes to zero as $n \to \infty$.


Of course proving that the sequence of probabilities $P(A_1), P(A_2), P(A_3), …$ is decreasing is not a problem.

But regarding the limit being zero, I looked in several books and also searched online, and I cannot find a decent proof of the fact that the sequence goes to $0$ as $n \to \infty$. My book basically states that this is obvious because in the series $$P(A_n) = \sum_{w \in A_n} p(w)$$ "we run out of terms" as $n \to \infty$. But that's not really a proof, is it? It's just an intuition-based note. So how do we prove that axiom A4 is satisfied?


Note 1: It seems to me that this is actually a real-analysis problem, in particular a series problem, but also related to set theory. Somehow I feel that $P(A_n)$ is the remainder term in the series defining $P(A_1)$, which is a convergent series, so $P(A_n)$ must go to zero. But I cannot really formalize this argument; I get confused in my thoughts. For this argument to work, it seems we need to order the elements of $A_1$ somehow: first taking those elements which don't belong to $A_2$, then those which don't belong to $A_3$, then those which don't belong to $A_4$, and so on. Then it feels like $P(A_n)$ is the remainder term of the series $P(A_1) = \sum_{w \in A_1} p(w)$. But as I said, I can't really formalize my intuition.

Note 2: Now I am thinking that my major confusion stems from the fact that I am not even sure what the $n$-th partial sum of this series $$\sum_{w \in A} p(w)$$ is. E.g. if $A = \{w_1, w_5, w_7, w_{90}, w_{100}, \dots \}$, is the 5th partial sum $p(w_1) + p(w_5) + p(w_7) + p(w_{90}) + p(w_{100})$, or is it $p(w_1) + 0 + 0 + 0 + p(w_5)$? I think we need to work with the second interpretation when proving that the axioms are satisfied (all axioms, not just A4, which I quoted above). If I use the first interpretation (of the partial sum), it's not quite clear how to prove the additivity axiom $P(A \cup B) = P(A) + P(B)$ when $A \cap B = \emptyset$. And we must use the additivity axiom to prove A4.
Also, it's not clear what $P(B)$ is if $B$ is finite.
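The two readings of "partial sum" in Note 2 can be compared numerically. Since all terms are non-negative, padding the sum with zeros or skipping over the missing indices cannot change the limit, so both interpretations yield the same value of $P(A)$. A sketch, again under the illustrative assumption $p(w_i) = 2^{-i}$ (my choice, not from the book):

```python
def p(i):
    return 2.0 ** (-i)          # illustrative choice: p(w_i) = 2^{-i}

A = [1, 5, 7, 90, 100]          # the event from Note 2, as sorted indices

# 1st interpretation: 5th partial sum = p(w_1)+p(w_5)+p(w_7)+p(w_90)+p(w_100)
s1 = sum(p(i) for i in A[:5])

# 2nd interpretation: sum p(w_i)*[w_i in A] over i = 1..100, zeros elsewhere
s2 = sum(p(i) if i in A else 0.0 for i in range(1, 101))

print(s1 == s2)                 # True: the zero terms change nothing
```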

Best Answer

Let $B_n = A_n \setminus A_{n+1}$ and $C_n = A_1 \setminus A_n = \bigcup_{k=1}^{n-1} B_k$ for all $n \in \mathbb N$.

Clearly the sets $B_1, B_2, B_3, \dots$ are mutually disjoint, while $C_1 \subseteq C_2 \subseteq C_3 \subseteq \dots$ form an ascending chain, with $\bigcup_{n=1}^\infty B_n = \bigcup_{n=1}^\infty C_n = A_1$ (the unions exhaust $A_1$ precisely because $\bigcap_{n=1}^\infty A_n = \emptyset$). Further, since $C_n = A_1 \setminus A_n$ with $A_n \subseteq A_1$, additivity gives $P(C_n) = P(A_1) - P(A_n)$.

We wish to show that $$\lim_{n\to\infty} P(A_n) = \lim_{n\to\infty} \bigl(P(A_1) - P(C_n)\bigr) = 0.$$

To do that, it suffices to observe that $$ \lim_{n\to\infty} P(C_n) = \lim_{n\to\infty} \sum_{k=1}^{n-1} P(B_k) = \sum_{n=1}^\infty P(B_n) = \sum_{n=1}^\infty \sum_{\omega \in B_n} p(\omega) = \sum_{\omega \in \bigcup_{n=1}^\infty B_n} p(\omega) = \sum_{\omega \in A_1} p(\omega) = P(A_1). $$


Basically, we're splitting the initial event $A_1$ into a disjoint union of events $B_1, B_2, B_3, \dots$, where the event $B_n$ contains exactly those outcomes $\omega \in \Omega$ that are removed from the descending chain of events $A_1 \supseteq A_2 \supseteq A_3 \supseteq \dots$ at the $n$-th step. We then observe that first adding up the probabilities of the outcomes in $B_1$, then those in $B_2$, etc. is equivalent to adding up the probabilities of all the outcomes in $A_1 = B_1 \cup B_2 \cup B_3 \cup \dots$; either way, we end up counting each outcome exactly once. Thus, conversely, as we first remove the outcomes in $B_1$ from the sum, then those in $B_2$, etc., we'll eventually end up removing every outcome from the sum, and are thus left with a limit probability of zero.
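As a numeric sanity check of this argument, take the illustrative choices (mine, not the answer's) $p(w_i) = 2^{-i}$ and $A_n = \{w_n, w_{n+1}, \dots\}$, a descending chain with empty intersection. Then $B_n = A_n \setminus A_{n+1} = \{w_n\}$, so $P(B_k) = p(w_k)$, and we can watch $P(A_n) = P(A_1) - P(C_n)$ decrease to zero:

```python
def p(i):
    return 2.0 ** (-i)  # illustrative choice: p(w_i) = 2^{-i}

P_A1 = 1.0  # P(A_1) = sum_{i>=1} 2^{-i} = 1

def P_C(n):
    """P(C_n) = sum_{k=1}^{n-1} P(B_k), where here P(B_k) = p(w_k)."""
    return sum(p(k) for k in range(1, n))

# P(A_n) = P(A_1) - P(C_n): a decreasing sequence with limit 0.
tails = [P_A1 - P_C(n) for n in range(1, 11)]
print(tails[:4])  # [1.0, 0.5, 0.25, 0.125]
```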
