I don’t see how the binomial theorem relates to the principle of inclusion and exclusion

binomial distributionbinomial theoremdiscrete mathematicsinclusion-exclusion

I'm learning discrete maths as a hobby at the moment and I got stuck when the tutor starting relating the binomial theorem to the principles of inclusion and exclusion. The video I was watching is https://www.youtube.com/watch?v=GS7dIWA6Hpo&list=PLDDGPdw7e6Aj0amDsYInT_8p6xTSTGEi2&index=6 and the question starts at (14.25).

I understand the binomial theorem, and I see how combinatorics is used to calculate the coefficients. I've seen multiple online tutorials where the binomial theorem has been used to describe the nature of inclusion and exclusion and I just don't understand.

Exclusion/Inclusion formula:

|A1 ∪ A2 ∪ A3| = |A1| + |A2| + |A3| − |A1 ∩ A2| − |A1 ∩ A3| − |A2 ∩ A3| + |A1 ∩ A2 ∩ A3|

This makes sense because we have to exclude the cases where elements are counted twice (drawing venn diagrams helped me understand this).

Binomial Theorem:

$(A+B)^n = \sum_{k=0}^n {n\choose{k}} A^{n-k} B^{k}$

This formula makes sense to me again, but can someone please explain it to me in simple terms how the binomial theorem is even related to inclusion/exclusion?

I've also seen proofs where examples substitute the x = 1 and y = -1 and we end up getting the binomial expansion to equal 0. I just don't see how we can relate that to PIE.

Please help.

EDIT:

This part of the video is what confuses me (17.27). I don't get the ((1) + (-1)) part, what is he trying to show from that?

Best Answer

The explanation goes back to the origins of Boolean algebra. Define a "Boolean" variable $\, x \,$ as one where $\, x^2 = x. \,$ For these variables $\, 1-x \,$ denotes logical negation and $\, xy \,$ denotes logical and. Finally, by De Morgan's laws, $\ 1-(1-x)(1-y) \,$ denotes logical or.

In the context of subsets of a univeral set $\, U, \,$ any subset $\, A \,$ of $\, U \,$ is identified by its indicator function $\, A(x) \,$ which is Boolean valued. Now the number of elements of $\, A \,$ is $\, |A| = \sum A(x) .\,$ Next, the indicator function of $\, A \cap B \,$ is $\, A(x)B(x), \,$ of complement of $\, A \,$ is $\, 1-A(x), \,$ and of $\, A \cup B \,$ is $\, 1-(1-A(x))(1-B(x)). \,$

Given any subsets $\, A_1, A_2, \dots, A_n \,$ and their indicator functions $\, A_1(x), A_2(x), \dots, A_n(x), \,$ we can write $\, 1-(1-A_1(x))(1-A_2(x)) = A_1(x) + A_2(x) - A_1(x)A_2(x) \,$ which implies $\, |A_1 \cap A_2| = |A_1| + |A_2| - |A_1 \cup A_2|. \,$ This obviously generalizes to any finite number of subsets and is one form of the PIE.

Now suppose that $\,A = A_1=A_2=\dots=A_n.\,$ Use the binomial theorem to get $$ (1-A(x))^n = \sum_{k=0}^n {n\choose{k}} (-1)^k A(x)^{k}$$ but since $\,A(x)\,$ and $\,1-A(x)\,$ are Boolean valued and if also $\,n>0\,$ this simplifies to $$ 1-A(x) = (1-A(x))^n = 1 + A(x)\sum_{k=1}^n {n\choose{k}} (-1)^k.$$ If now $\,A(x)\ne 0\,$ then this implies that $$ -1 = \sum_{k=1}^n {n\choose{k}} (-1)^k \qquad \text{ and } \qquad 0 = \sum_{k=0}^n {n\choose{k}} (-1)^k . $$