Solved – Proof for probability of union of n events

probabilityproof

I'm trying to refresh my knowledge of probability so I'm working my way through Haigh's Probability Models 2e. I'm looking at one of the corollaries presented (1.3) and I don't understand it. The corollary states:
$$
P\left(\bigcup _{i=1}^n A_i\right) = S_1 – S_2 + S_3 – … +(-1)^{n+1}S_n
$$
where
$$
S_1 = \sum _{i=1}^nP(A_i), \hspace{8pt} S_2 = \sum _{i<j}P(A_i \cap A_j), \hspace{8pt} S_n = P\left(\bigcap _{i=1}^nA_i\right)
$$
So for the initial step ($n=2$) I should get the following:
$$
P(A_1 \cup A_2) = P(A_1) + P(A_2) – P(A_1 \cap A_2)
$$
which works using $S_1$ and $S_2$ above. However, (this is the confusing part for me) $S_n$ for $n=1$ gives me $S_1 = P\big(\bigcap _{i=1}^1A_i\big) = P(A_1)$ when I should get $S_1 = P(A_1) + P(A_2)$. Does $S_n$ just not apply to $S_1$ and $S_2$ (i.e. $S_n$ = … for $n > 2$)? I don't think that's mathematically sound. What gives?

Best Answer

What you are describing is the inclusion-exclusion principle in probability. $S_k$ is sum of the probability of all k-cardinality intersections among your sets. As a worked example, in the $n=4$ case, you would have:

\begin{align*} S_1 &= P(A_1) + P(A_2) + P(A_3) + P(A_4) \\ S_2 &= P(A_1\cap A_2) + P(A_1\cap A_3) + P(A_1\cap A_4) + P(A_2\cap A_3) + P(A_2\cap A_4) + P(A_3\cap A_4) \\ S_3 &= P(A_1\cap A_2\cap A_3) + P(A_1\cap A_2\cap A_4) + P(A_1\cap A_3\cap A_4) + P(A_2\cap A_3\cap A_4) \\ S_4 &= P(A_1\cap A_2\cap A_3\cap A_4) \end{align*}

It is a bit notationally tricky to write out $S_k$ for arbitrary $k$, which is probably why the book only chose to write out $S_1$, $S_2$, and $S_n$. However you could write the following in general (the second one, which uses an index set, is how the Wikipedia article defines $S_k$):

\begin{align*} S_k &= \sum_{1\leq i_1<i_2<\ldots<i_k\leq n} P\big(\bigcap_{j=1}^k A_{i_j}\big) \\ &= \sum_{\substack{I\subset\{1, 2, \ldots, n\}\\|I| = k}} P\big(\bigcap_{i\in I} A_i\big) \end{align*}

The proof of the result relies on the identity $P(A\cup B) = P(A) + P(B) - P(A\cap B)$ applied inductively:

\begin{align*} P(\cup_{i=1}^n A_i) &= P((\cup_{i=1}^{n-1} A_i)\cup A_n) \\ &= P(\cup_{i=1}^{n-1} A_i) + P(A_n) - P((\cup_{i=1}^{n-1} A_i)\cap A_n) \\ &= P(\cup_{i=1}^{n-1} A_i) + P(A_n) - P(\cup_{i=1}^{n-1} (A_i\cap A_n)) \\ &= \sum_{k=1}^{n-1}(-1)^{k+1} \sum_{1\leq i_1<i_2<\ldots<i_k\leq n-1} P(\cap_{j=1}^k A_{i_j}) + P(A_n) -\\ & ~~~~~\sum_{k=1}^{n-1}(-1)^{k+1} \sum_{1\leq i_1<i_2<\ldots<i_k\leq n-1} P(\cap_{j=1}^k A_{i_j}\cap A_n) \\ &= \sum_{k=1}^{n}(-1)^{k+1} \sum_{1\leq i_1<i_2<\ldots<i_k\leq n} P(\cap_{j=1}^k A_{i_j}) \end{align*}

The last step requires pairing the $k=1$ case of the first summation with $P(A_n)$ and then pairing $k=2$ from the first summation with $k=1$ from the second summation, pairing $k=3$ from the first summation with $k=2$ from the second summation, and so on.

Related Question