Expected Value of the Minimum with Limited Independence – Probability Analysis

pr.probability

Imagine you sample $n$ number with replacement uniformly from the integers $1,\dots, n$. Let $X$ be the minimum of these samples. I am interested in $\mathbb{E}(X)$ but with a twist. All I know is that the samples are uniform and pairwise independent.

Assuming $n$ is large, what bounds can one get for $\mathbb{E}(X)$?

If we generalize this to $k$-wise independence, for $k \geq 2$, what can we say?

[Also asked at https://math.stackexchange.com/questions/1179943/expected-value-of-the-minimum-with-limited-independence previously. ]

Best Answer

Let $u_k$ be the number of variables with value exactly $k$. If you pick a distribution of the $u_k$ such that $E[u_k]=1$, $E[u_k^2] = 2-1/n$, $E[u_ku_l] = 1-1/n$ for $k \neq l$ and $\sum_{k=1}^n u_k$ is always $1$, then by choosing a random $u_1, \dots ,u_n$ according to the distribution and then choosing a random ordering, you get a random set of variables with the desired independence property.

On the other hand if you do the same thing for $n+1$ pairwise independent random variables you get $E[u_k]=1+1/n$, $E[u_k^2] = 2+ 2/n$, $E[u_k u_l] = 1+1/n$. Here is a way to construct a distribution on $u_k$ that satisfies these conditions. To get a distribution for $n$ pairwise independent random variables, we just delete the last variable. The formula is:

With probability $(1+1/n) /k(k+1)$, we have $u_k=k+1$, $u_l =0$ for $l <k$, $u_l=1$ for $l>k$. To compute the three moments, we ignore a factor of $1+1/n$:

Mean:

$$ \frac{(k+1)}{k(k+1)} + \sum_{l < k} \frac{1}{l(l+1) } = \frac{1}{k} + 1- \frac{1}{k} = 1 $$

Mean of square:

$$ \frac{(k+1)^2}{k(k+1)} + \sum_{l < k} \frac{1}{l(l+1) } = \frac{k+1}{k} + 1- \frac{1}{k+1} = 2 $$

Mean of product is the same as mean, since if $l<k$ then $u_l=1$ whenever $u_k>0$.

Then put back in the factor of $(1+1/n)$ to get the right answer.

The expected value of the minimum of the $n+1$ variables at least $\sum_{k=1}^n (1+1/n) /(k+1) \approx \log n$. Removing a variable won't make the minimum any smaller.