Probability of choosing disjoint sets from $\{1,2,…,n\}$

probability

The set $S$ is the set of all integers from $1$ to $n$. The sets $A_1, A_2, . . ., A_m$, which are not necessarily distinct, are chosen randomly and
independently from $\mathcal{P}(S)$, and for each $k$ $(1 \le k \le m)$, the set $A_k$ is equally likely to be any of the sets in $\mathcal{P}(S)$.

(a) By considering each integer separately, show that $P(A_1 ∩ A_2 = ∅) = (\frac{3}{4})^n$
and find $P(A_1 ∩ A_2 ∩ A_3 = ∅)$ and $P(A_1 ∩ A_2 ∩ · · · ∩ A_m = ∅)$

(b) Find $P(A_1 ⊆ A_2) , P(A_1 ⊆ A_2 ⊆ A_3)$ and $P(A_1 ⊆ A_2 ⊆ · · · ⊆ A_m)$ .

I think the question I want to ask is "Why does independence not guarantee $P(A_1\cap A_2\cap A_3)=P(A_k\cap A_3)$ and $P(A_1\subseteq A_2\subseteq A_3)=P(A_1\subseteq A_2)P(A_2\subseteq A_3)$?" but I'll show my method to this question and the reason it is wrong might be different, I'm not sure. I already have model solutions to this question, so knowing the answer is not my interest here.

For $a)$, it is not too difficult to show the first result considering $\sum_{r=1}^n 1-P(r\in A_1)P(r\in A_2)$. I wanted to generalise this by saying that $A_1\cap A_2$ is an arbitrary set in $\mathcal{P}(S)$, say $A_k$, so $P(A_1\cap A_2\cap A_3=\emptyset)=P(A_k\cap A_3=\emptyset)=\left(\frac{3}{4}\right)^n$,but this is wrong and also feels unsettling to have the probability unchanged and I can't see why the method isn't valid. Of course, it is not much a leap further to say $A_1\cap A_2\cap A_3$ is an arbitrary set in $\mathcal{P}(S)$ instead giving the probability to be $\frac{1}{2^n}$!

For $b)$, I tried to use $a)$. $P(A_1\subseteq A_2)=P(A_1^c\cap A_2=\emptyset)=\left(\frac{3}{4}\right)^n$ and this works in this case but if we try use it again $P(A_1\subseteq A_2 \subseteq A_3)=P(A_1\subseteq A_2)P(A_2\subseteq A_3)=\left(\frac{3}{4}\right)^{2n}$, it does not give the correct answer, I presume the first equality is where it goes wrong but I don't know why.

There is discussion of this question here but I don't think it is related.

Best Answer

As AnilCh indicated in a comment following my answer, while my answer solves the underlying math questions, it does not resolve the OP's specific questions. I have therefore added an Addendum to respond to these questions.


Part (a)

Note that the sets $A_1, \cdots, A_m$ are selected with replacement from $\mathcal{P}(S)$. For any random set $B \in \mathcal{P}(S)$, and any random element $k \in \{1,2, \cdots, n\}$, the chance that $k \in B$ is $(1/2).$

Given any two (not necessarily) distinct sets $A_1, A_2$ from $\mathcal{P}(S)$, the chance that a random element $k$ is not in both sets is $1 - (1/2)^2 = (3/4)$. Further, whether one element $k_1$ is in a set does not affect whether a different element $k_2$ is in the same set, where the set is taken at random from $\mathcal{P}(S)$.

This means that the event that $k_1$ is missing from at least one of a group of sets is an independent event with respect to whether $k_2$ is missing from at least one of the same group of sets.

The chance that a specific element $k$ is not in all of $A_1, \cdots, A_m$ is $\frac{2^m - 1}{2^m}$. Therefore, for the $m$ sets to have the empty set as their intersection, all $n$ elements must be such that none of them are in all of $A_1, \cdots, A_m$.

This means that the chance that the intersection of $A_1, \cdots, A_m$ is the empty set is $$\left(\frac{2^m - 1}{2^m}\right)^n.$$


Part (b)

In order to have $A_1 \subseteq A_2$, you must have that for each element $k \in \{1,2,\cdots, n\}$, it is not the case that $k \in A_1$ and simultaneously $k \not\in A_2$.

The chance that a specific element $k$ satisfies the above constraint is $(3/4)$. Therefore, the chance that $A_1 \subseteq A_2$ is $(3/4)^n$.

To consider the chance of $A_1 \subseteq A_2 \subseteq A_3$, you again go element by element. Consider the following truth table, based on the issue of which sets contain the element $k$.

\begin{array}{| r | r | r | r |} \hline k \in A_1 & k \in A_2 & k \in A_3 & A_1 \subseteq A_2 \subseteq A_3\\ \hline T & T & T & T\\ \hline T & T & F & F\\ \hline T & F & T & F\\ \hline T & F & F & F\\ \hline F & T & T & T\\ \hline F & T & F & F\\ \hline F & F & T & T \\ \hline F & F & F & T\\ \hline \end{array}

The above chart signifies that the chance that $A_1 \subseteq A_2 \subseteq A_3$, when attention is confined to a specific element $k$ is $(4/8) = (1/2)$.

Therefore, the chance that (with respect to all $n$ elements), $A_1 \subseteq A_2 \subseteq A_3 = (1/2)^n$.

Determining the specific chance that $A_1 \subseteq \cdots \subseteq A_m$ with respect to a specific element $k$ will take some work.

Suppose that with respect to a specific element $k$, the chance is $r \in (0,1)$, for $L$ sets, where $L \in \{1,2,\cdots, (m-1)\}.$ Then, when you add the set $A_{L+1}$, (1/2) the time, $k \in A_{L+1}$, and $(1/2)$ the time, $k \not\in A_{L+1}$.

The $(1/2)$ the time that $k \in A_{L+1}$, the probability $r$ pertains. The $(1/2)$ the time that $k \not\in A_{L+1}$, $k$ will have to also be missing from all of the other sets.

Therefore, with respect to a single element $k$, if the probability of $A_1 \subseteq \cdots \subseteq A_L = r$, then the probability of $A_1 \subseteq \cdots \subseteq A_L \subseteq A_{L+1}$ is equal to

$$\left[\frac{1}{2} \times r\right] + \left[\frac{1}{2} \times \frac{1}{2^L}\right] = \left[\frac{1}{2} \times r\right] + \frac{1}{2^{L+1}}.$$

What is needed is a closed form expression for the probability of $A_1 \subseteq \cdots \subseteq A_m$, as a function of $m$, when attention is confined to only one specific element $k$. Consider the following table.

\begin{array}{| r | r |} \hline m & p\{A_1 \subseteq \cdots \subseteq A_m\} \\ \hline 2 & \displaystyle \frac{3}{4} \\ \hline 3 & \displaystyle \frac{4}{8} \\ \hline 4 & \displaystyle \frac{5}{16} \\ \hline \end{array}

So, the closed form that represents the above table is $$\frac{m+1}{2^m}.$$

Therefore, since you have to focus on the $n$ independent events that represent each of the $n$ elements, the chance that $$A_1 \subseteq \cdots \subseteq A_m = \left(\frac{m+1}{2^m}\right)^n.$$


Addendum

I think the question I want to ask is "Why does independence not guarantee $P(A_1\cap A_2\cap A_3)=P(A_k\cap A_3)$ and $P(A_1\subseteq A_2\subseteq A_3)=P(A_1\subseteq A_2)P(A_2\subseteq A_3)$?"

Taking these questions one at a time:

For the first question, I am presuming that what is being asked is:

Why is it that (in general)
$p\left[(A_1 \cap A_2 \cap A_3) = \emptyset\right] \neq p\left[(A_4 \cap A_5) = \emptyset\right].$

I think that the best way to answer this is to consider that asking whether a group of sets have the emptyset as their intersection is equivalent to whether $n$ independent events have all occurred. Here, each event is that element $k$ is not in all of the sets.

If $2$ sets are involved, then the chance that element $k$ is not in both sets is $\frac{3}{4}.$ If $3$ sets are involved, then the chance that element $k$ is not in all three sets is $\frac{7}{8}$.

So the answer to the question is that the probabilities are different because $\left(\frac{3}{4}\right)^n \neq \left(\frac{7}{8}\right)^n.$


Your second question raises a much more subtle point:

Why is it that $P(A_1\subseteq A_2\subseteq A_3)$ is not equal to $P(A_1\subseteq A_2)P(A_2\subseteq A_3)$?

It is because (believe it or not) the events that $(A_1 \subseteq A_2)$ and $(A_2 \subseteq A_3)$ are not independent events. That is, when you have two random sets $A_1, A_2$ from $\mathcal{P}(S)$, on average, $A_1$ is going to have about $\frac{n}{2}$ elements. This means that (in general) $A_2$ must have those specific $\frac{n}{2}$ elements, and may well have other elements. It therefore becomes harder than normal for $A_2 \subseteq A_3$.

Rebuttal to the above hand-waving is that if $A_1 \subseteq A_2$, then it is less likely than normal that $A_1$ will have $\frac{n}{2}$ elements. This is true, but it doesn't mean that it is irrelevant that $A_1 \subseteq A_2$. That is, you should still expect it to be moderately harder for $A_2 \subseteq A_3$, than for some random set $A_k$ to be such that $A_k \subseteq A_3$.

In any case, with respect to the subtleties around your second question, these are informal handwaving arguments. The real proof is in the explicit analysis in my answer.

Related Question