Probability of choosing disjoint sets from $\{1,2,…,n\}$

probability

The set $S$ is the set of all integers from $1$ to $n$. The sets $A_1, A_2, . . ., A_m$, which are not necessarily distinct, are chosen randomly and
independently from $\mathcal{P}(S)$, and for each $k$ $(1 \le k \le m)$, the set $A_k$ is equally likely to be any of the sets in $\mathcal{P}(S)$.

(a) By considering each integer separately, show that $P(A_1 ∩ A_2 = ∅) = (\frac{3}{4})^n$
and find $P(A_1 ∩ A_2 ∩ A_3 = ∅)$ and $P(A_1 ∩ A_2 ∩ · · · ∩ A_m = ∅)$

(b) Find $P(A_1 ⊆ A_2) , P(A_1 ⊆ A_2 ⊆ A_3)$ and $P(A_1 ⊆ A_2 ⊆ · · · ⊆ A_m)$ .

I think the question I want to ask is "Why does independence not guarantee $P(A_1\cap A_2\cap A_3)=P(A_k\cap A_3)$ and $P(A_1\subseteq A_2\subseteq A_3)=P(A_1\subseteq A_2)P(A_2\subseteq A_3)$?" but I'll show my method to this question and the reason it is wrong might be different, I'm not sure. I already have model solutions to this question, so knowing the answer is not my interest here.

For $a)$, it is not too difficult to show the first result considering $\sum_{r=1}^n 1-P(r\in A_1)P(r\in A_2)$. I wanted to generalise this by saying that $A_1\cap A_2$ is an arbitrary set in $\mathcal{P}(S)$, say $A_k$, so $P(A_1\cap A_2\cap A_3=\emptyset)=P(A_k\cap A_3=\emptyset)=\left(\frac{3}{4}\right)^n$,but this is wrong and also feels unsettling to have the probability unchanged and I can't see why the method isn't valid. Of course, it is not much a leap further to say $A_1\cap A_2\cap A_3$ is an arbitrary set in $\mathcal{P}(S)$ instead giving the probability to be $\frac{1}{2^n}$!

For $b)$, I tried to use $a)$. $P(A_1\subseteq A_2)=P(A_1^c\cap A_2=\emptyset)=\left(\frac{3}{4}\right)^n$ and this works in this case but if we try use it again $P(A_1\subseteq A_2 \subseteq A_3)=P(A_1\subseteq A_2)P(A_2\subseteq A_3)=\left(\frac{3}{4}\right)^{2n}$, it does not give the correct answer, I presume the first equality is where it goes wrong but I don't know why.

There is discussion of this question here but I don't think it is related.

Best Answer

As AnilCh indicated in a comment following my answer, while my answer solves the underlying math questions, it does not resolve the OP's specific questions. I have therefore added an Addendum to respond to these questions.

Part (a)

Note that the sets $A_1, \cdots, A_m$ are selected with replacement from $\mathcal{P}(S)$. For any random set $B \in \mathcal{P}(S)$, and any random element $k \in \{1,2, \cdots, n\}$, the chance that $k \in B$ is $(1/2).$

Given any two (not necessarily) distinct sets $A_1, A_2$ from $\mathcal{P}(S)$, the chance that a random element $k$ is not in both sets is $1 - (1/2)^2 = (3/4)$. Further, whether one element $k_1$ is in a set does not affect whether a different element $k_2$ is in the same set, where the set is taken at random from $\mathcal{P}(S)$.

This means that the event that $k_1$ is missing from at least one of a group of sets is an independent event with respect to whether $k_2$ is missing from at least one of the same group of sets.

The chance that a specific element $k$ is not in all of $A_1, \cdots, A_m$ is $\frac{2^m - 1}{2^m}$. Therefore, for the $m$ sets to have the empty set as their intersection, all $n$ elements must be such that none of them are in all of $A_1, \cdots, A_m$.

This means that the chance that the intersection of $A_1, \cdots, A_m$ is the empty set is $$\left(\frac{2^m - 1}{2^m}\right)^n.$$

Part (b)

In order to have $A_1 \subseteq A_2$, you must have that for each element $k \in \{1,2,\cdots, n\}$, it is not the case that $k \in A_1$ and simultaneously $k \not\in A_2$.

The chance that a specific element $k$ satisfies the above constraint is $(3/4)$. Therefore, the chance that $A_1 \subseteq A_2$ is $(3/4)^n$.

To consider the chance of $A_1 \subseteq A_2 \subseteq A_3$, you again go element by element. Consider the following truth table, based on the issue of which sets contain the element $k$.

\begin{array}{| r | r | r | r |} \hline k \in A_1 & k \in A_2 & k \in A_3 & A_1 \subseteq A_2 \subseteq A_3\\ \hline T & T & T & T\\ \hline T & T & F & F\\ \hline T & F & T & F\\ \hline T & F & F & F\\ \hline F & T & T & T\\ \hline F & T & F & F\\ \hline F & F & T & T \\ \hline F & F & F & T\\ \hline \end{array}

The above chart signifies that the chance that $A_1 \subseteq A_2 \subseteq A_3$, when attention is confined to a specific element $k$ is $(4/8) = (1/2)$.

Therefore, the chance that (with respect to all $n$ elements), $A_1 \subseteq A_2 \subseteq A_3 = (1/2)^n$.

Determining the specific chance that $A_1 \subseteq \cdots \subseteq A_m$ with respect to a specific element $k$ will take some work.

Suppose that with respect to a specific element $k$, the chance is $r \in (0,1)$, for $L$ sets, where $L \in \{1,2,\cdots, (m-1)\}.$ Then, when you add the set $A_{L+1}$, (1/2) the time, $k \in A_{L+1}$, and $(1/2)$ the time, $k \not\in A_{L+1}$.

The $(1/2)$ the time that $k \in A_{L+1}$, the probability $r$ pertains. The $(1/2)$ the time that $k \not\in A_{L+1}$, $k$ will have to also be missing from all of the other sets.

Therefore, with respect to a single element $k$, if the probability of $A_1 \subseteq \cdots \subseteq A_L = r$, then the probability of $A_1 \subseteq \cdots \subseteq A_L \subseteq A_{L+1}$ is equal to

$$\left[\frac{1}{2} \times r\right] + \left[\frac{1}{2} \times \frac{1}{2^L}\right] = \left[\frac{1}{2} \times r\right] + \frac{1}{2^{L+1}}.$$

What is needed is a closed form expression for the probability of $A_1 \subseteq \cdots \subseteq A_m$, as a function of $m$, when attention is confined to only one specific element $k$. Consider the following table.

\begin{array}{| r | r |} \hline m & p\{A_1 \subseteq \cdots \subseteq A_m\} \\ \hline 2 & \displaystyle \frac{3}{4} \\ \hline 3 & \displaystyle \frac{4}{8} \\ \hline 4 & \displaystyle \frac{5}{16} \\ \hline \end{array}

So, the closed form that represents the above table is $$\frac{m+1}{2^m}.$$

Therefore, since you have to focus on the $n$ independent events that represent each of the $n$ elements, the chance that $$A_1 \subseteq \cdots \subseteq A_m = \left(\frac{m+1}{2^m}\right)^n.$$

Addendum

I think the question I want to ask is "Why does independence not guarantee $P(A_1\cap A_2\cap A_3)=P(A_k\cap A_3)$ and $P(A_1\subseteq A_2\subseteq A_3)=P(A_1\subseteq A_2)P(A_2\subseteq A_3)$?"

Taking these questions one at a time:

For the first question, I am presuming that what is being asked is:

Why is it that (in general)
$p\left[(A_1 \cap A_2 \cap A_3) = \emptyset\right] \neq p\left[(A_4 \cap A_5) = \emptyset\right].$

I think that the best way to answer this is to consider that asking whether a group of sets have the emptyset as their intersection is equivalent to whether $n$ independent events have all occurred. Here, each event is that element $k$ is not in all of the sets.

If $2$ sets are involved, then the chance that element $k$ is not in both sets is $\frac{3}{4}.$ If $3$ sets are involved, then the chance that element $k$ is not in all three sets is $\frac{7}{8}$.

So the answer to the question is that the probabilities are different because $\left(\frac{3}{4}\right)^n \neq \left(\frac{7}{8}\right)^n.$

Your second question raises a much more subtle point:

Why is it that $P(A_1\subseteq A_2\subseteq A_3)$ is not equal to $P(A_1\subseteq A_2)P(A_2\subseteq A_3)$?

It is because (believe it or not) the events that $(A_1 \subseteq A_2)$ and $(A_2 \subseteq A_3)$ are not independent events. That is, when you have two random sets $A_1, A_2$ from $\mathcal{P}(S)$, on average, $A_1$ is going to have about $\frac{n}{2}$ elements. This means that (in general) $A_2$ must have those specific $\frac{n}{2}$ elements, and may well have other elements. It therefore becomes harder than normal for $A_2 \subseteq A_3$.

Rebuttal to the above hand-waving is that if $A_1 \subseteq A_2$, then it is less likely than normal that $A_1$ will have $\frac{n}{2}$ elements. This is true, but it doesn't mean that it is irrelevant that $A_1 \subseteq A_2$. That is, you should still expect it to be moderately harder for $A_2 \subseteq A_3$, than for some random set $A_k$ to be such that $A_k \subseteq A_3$.

In any case, with respect to the subtleties around your second question, these are informal handwaving arguments. The real proof is in the explicit analysis in my answer.

Best Answer

Related Solutions

Probability and subsets

Related Question