Coupon Collector problem – 2nd moment method

coupon-collectorprobabilityrandom variables

Recall the coupon collector problem, namely, drawing numbers (coupons) iid from $[n]$.
I want to prove that the probability that we drawed all numbers after $k=\frac{n\ln n}{4}$ tries tends to $0$ as $n$ goes to $\infty$.

For that, define $n$ random variables $X_1,\dots, X_n$ forwhich $X_i =1$ iff the $i$'th coupon wasn't drawed after $k$ tries and $X=\sum _i X_i$ that counts the number of coupons we didn't draw after $k$ tries. I want to prove that $\Pr [X=0]\xrightarrow{k \to \infty} 0$.

Observations/facts I have used (all easy to prove/known):

  1. $\Pr [X=0]\leq \frac{\text{Var}[X]}{\mathbb{E}^2[X]}$ .
  2. $\Pr[X_i=1]= \mathbb{E}[X_i]=\left(1-\frac{1}{n} \right)^k$
  3. $\mathbb{E}[X]=n\left(1-\frac{1}{n} \right)^k \ge n^{\frac{1}{2}}$
  4. $\text{Var}[X]=\sum_i \text{Var}[X_i]+\sum_{i\ne j} \text{Cov}[X_i,X_j]\leq \mathbb{E}[X] + \sum_{i\ne j} \text{Cov}[X_i,X_j]$
  5. $\sum_{i\ne j} \text{Cov}[X_i,X_j] \leq \sum_{i\ne j} \left(1-\frac{2}{n} \right)^k \leq n^2 \left(1-\frac{2}{n} \right)^k\leq n^{\frac{3}{2}}$.

Where I used:

  • $1-x \leq e^{-x}$ for all $x\in\mathbb{R}$,
  • $1-x \geq e^{-2x}$ for all $x\leq \frac{1}{4} $

and the definition of $k$ in 5 and 3 respectively.

We get:

$$\Pr [X=0]\leq o(1) + \frac{\sum_{i \ne j}\text{Cov}[X_i,X_j]}{\mathbb{E}^2[X]}\leq o(1) + \frac{n^{\frac{3}{2}}}{n}=o(1)+ \sqrt{n}$$ and I need a much better bound.

Appreciate any help/hints as I've been stuck on it for a couple of days.

Best Answer

Well, apparently the co-variance is negative:

$$\text{Cov} \left( X_i , X_j \right ) = \mathbb{E} \left( X_i X_j \right )-\mathbb{E} \left( X_i\right ) \mathbb{E} \left( X_j\right )$$

$$=\left( 1- \frac{2}{n} \right )^k -\left( 1- \frac{1}{n} \right )^{2k} \leq n^{-\frac{1}{2}}-n^{-\frac{1}{2}}\leq 0$$

By using the inequalities above and assuming $n\geq 4$ .

That finishes the analysis.

Related Question