Why Not Fewer Axioms for Probability Theory

combinatorics, discrete mathematics, probability, probability theory

From what I understand, modern probability theory assumes the following axioms:

  1. $0 \le P(E) \le 1$.
  2. $P(S) = 1$.
  3. $P(\bigcup_{i=1}^\infty E_i) = \sum_{i=1}^\infty P(E_i)$ where $E_i \cap E_j=\emptyset$ for $1 \le i < j$.

Clearly, the inclusion–exclusion principle reduces to axiom 3 whenever events are mutually exclusive. Moreover, the inclusion–exclusion principle may be proven by induction, without any help from axiom 3 whatsoever. Why then is axiom 3 even an axiom at all?

Best Answer

Contra your claim, axioms $(1)$ and $(2)$ alone are extremely weak. For example, taking $S=[0,1]$ for concreteness, let $P_*(X)=1$ if $0\in X$ or $1\not\in X$, and $P_*(X)=0$ otherwise. This $P_*$ satisfies $(1)$ because it only takes the values $0$ and $1$, and it satisfies $(2)$ because $0\in S$. Nevertheless it is absolutely horrible: beyond merely failing to satisfy $(3)$, it's not even monotonic, since e.g. $P_*([{1\over 2}, 1))=1$ but $P_*([{1\over 2}, 1])=0$. For that matter, it also has $P_*(\emptyset)=1$.
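To make the counterexample easy to check by hand or by machine, here is a minimal sketch in Python. It is not the example on $[0,1]$ itself but a finite stand-in: the sample space is assumed to be just the three points $0$, ${1\over 2}$, $1$, so that every event can be enumerated. The names `p_star`, `subsets`, `mono_fail` and `add_fail` are made up for this illustration.

```python
from itertools import combinations

# Finite stand-in for S = [0, 1] (an assumption for illustration only):
# we keep just the points 0, 1/2 and 1 so all events can be listed.
S = frozenset({0.0, 0.5, 1.0})


def p_star(x):
    """P_*(X) = 1 if 0 is in X or 1 is not in X, and 0 otherwise."""
    return 1 if (0.0 in x or 1.0 not in x) else 0


def subsets(s):
    """Return every subset of s as a frozenset."""
    items = sorted(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


events = subsets(S)

# Axioms (1) and (2) hold for every event.
axiom1 = all(0 <= p_star(e) <= 1 for e in events)
axiom2 = p_star(S) == 1

# Monotonicity fails: pairs A ⊆ B with P_*(A) > P_*(B).
mono_fail = [(a, b) for a in events for b in events
             if a <= b and p_star(a) > p_star(b)]

# Finite additivity fails: disjoint A, B with P_*(A ∪ B) != P_*(A) + P_*(B).
add_fail = [(a, b) for a in events for b in events
            if not (a & b) and p_star(a | b) != p_star(a) + p_star(b)]

print("axiom 1 holds:", axiom1)
print("axiom 2 holds:", axiom2)
print("monotonicity violations:", len(mono_fail))
print("additivity violations:", len(add_fail))
```

In this finite model, $\{{1\over 2}\}$ plays the role of $[{1\over 2},1)$ and $\{{1\over 2},1\}$ plays the role of $[{1\over 2},1]$. The pair $\{{1\over 2}\}$, $\{1\}$ is disjoint with $P_*$-values $1$ and $0$, yet their union has $P_*$-value $0$, so additivity already fails for two events, never mind countably many.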
