[Math] Randomly selected subset, expected value of the sum

probability

Interesting problem I spotted while learning:

Let $X=\left\{1,..,n\right\}$. We randomly select subset of $X$ and name it $A$. Each subset if equally likely.

a) Find the expected value of the sum of elements of A.

b) Find the expected value of the sum of elements of A, on condition that it has $k$ elements.

a) I think I know how to solve a). If each subset is selected with the same probability then I think it is equivalent to selecting each element of $X$ with probability $\frac{1}{2}$. So, using indicators, we got that expected value we are looking for is $\frac{n(n+1)}{4}$. But I can't find any rigorous argument why it is equivalent to selecting each element with probability $1/2$.

b) Small observation with $k=1$ (each element selected with probability $1/n$) and $k=n$ (each element selected with probability $1$) gives me feeling that approach from a) can be used with probability $k/n$ and then the result is $\frac{k(n+1)}{2}$. But it is much less intuitive than observation in a). No idea, how to prove this. Can anyone help?

Best Answer

Regarding your first question, note that a subset has an equivalent representation as a binary sequence. For example, for $n = 4$, the subset $\{1,3\}$ can be identified with $(1,0,1,0)$. Now, picking a subset uniformly at random, is like picking a sequence uniformly at random out of the $2^n$ possibilities. You should be able to argue the rest.

Regarding your 2nd question, let $X_i$ be the indicator that the $i$ is present in the random subset (i.e., $i$-th position in the binary representation above has a $1$) Then, you want $$ E\Big[ \sum_{i=1}^n i X_i \Big| \sum_{i=1}^n X_i = k\Big] = \sum_{i=1}^n i\, E\Big[X_i \Big| \sum_j X_j = k\Big] $$ Now, you can use symmetry. Let $a_i :=E[X_i \Big| \sum_j X_j = k]$ (which is the conditional probability of choosing the $i$). Then all $a_i$ should be equal and $\sum_i a_i = k$ (why?), from which it follows that $a_i = k/n$. This give you the answer that you have, and also shows that the conditional probability of choosing the $i$ is $k/n$ for every $i$.

Related Question