Why is the number of subsets in a sample space $2^n$

combinatoricsself-studyset-theorysubset

I am working my way through the statistical inference textbook and am stuck on this question:

  • Suppose that a sample space $S$ has $n$ elements. Prove that the
    number of subsets that can be formed from the elements of S is $2^n$

Taking examples for $n=2$:
$$n=2;\{\{\emptyset \}, \{a\}, \{b\}, \{a,b\}\};$$
And for $n=3$
$$n=3;\{\{\emptyset \}, \{a\}, \{b\}, \{c\}, \{a,b\}, \{a,c\}, \{b,c\}, \{a,b,c\}\}$$

I do understand that any subset needs to be constructed by including and excluding elements of $n$ but I am not sure how to prove it. Just showing a few examples seems to not be enough.

What would be a formal proof of this simple fact?

Best Answer

You ask for a formal proof. That requires definitions and a simple construction. I offer these because they are generally useful in statistics.

The following is merely a formal way of explaining that any binary vector determines a subset (from the positions of its $1$s) and, conversely, any set of positions determines a binary vector.


First, the construction, because it's the crux of the matter. When $S$ is a set and $A\subseteq S,$ the indicator function $\mathscr I_{A;S}$ is the relation

$$ \mathscr I_{A;S} = \{(x,0)\mid x\in S,\ x \notin A\} \cup \{(x,1)\mid x \in A\}.$$

The set of all indicator functions with domain $S$ doesn't have any conventional name, so here let's call it $\mathcal{I}(S).$ Because a function on any set $S$ is a vector, you can think intuitively of $\mathcal I(S)$ as the set of binary vectors indexed by $S.$

As a matter of notation, for any set $S$, write $|S|$ for its cardinality. (All you need to know about cardinality is that two sets $S$ and $T$ have the same cardinality if and only if there exist injective functions from $S$ to $T$ and from $T$ to $S.$)

  1. The definition of powers of $2$ is $|\mathcal{I}(S)| = 2^{|S|}.$ That is, when $n$ is any cardinal, $2^n$ is the cardinality of the set of indicator functions on any set $S$ of cardinality $n.$ (Intuitively, $2^n$ counts how many binary numbers with $n$ digits there are -- even when $n$ is zero or infinite!)

  2. It is an axiom of set theory that the collection of all subsets of any set $S$ -- its "power set" $\mathscr P(S)$ -- is a set.

The question asks to show that $\mathcal{I}(S)$ and $\mathscr P(S)$ have the same cardinality for any finite set $S.$ We won't require finiteness in this proof.

  1. Let $\sigma: \mathscr{P}(S) \to \mathcal I(S)$ map any subset of $S$ to its indicator function.

  2. Let $\tau: \mathcal I(S) \to \mathscr P(S)$ assign to any indicator function $\mathscr I$ the set $\{x\in S\mid \mathscr I(x)=1\}.$ That is, the $1$'s in a binary vector determine a subset.

It is immediate (from the initial construction and definitions) that the compositions $\sigma\circ \tau$ and $\tau\circ\sigma$ are the identity functions on $\mathcal I(S)$ and $\mathscr P(S),$ respectively. Therefore both $\sigma$ and $\tau$ are bijective, whence $|\mathcal I(S)| = |\mathscr P(S)|,$ QED.

In statistical parlance, certain subsets $A$ of a probability space $S$ are events while their indicator functions $\mathscr I_{A;S}$ are random variables. This demonstration, as embodied in the maps $\sigma$ and $\tau,$ shows how we can freely move back and forth between the two concepts as needed.


In light of other attempts in this thread to display proofs that do not rely (explicitly) on induction, notice that use of mathematical induction cannot possibly be lurking, because such a result would apply at best to ordinal numbers, not all cardinals.

Related Question