Probability – Variance of Rolling a Die Until All Numbers Appear

coupon-collector, dice, probability

We roll a die until we have obtained every number from $1$ to $6$. I found the expected number of rolls by writing $X = X_1 + \dots + X_6$, where $X_i$ is the number of rolls needed to obtain a result different from the previous $i-1$ distinct results, and using the geometric distribution. My result is correct. But then I wanted to find the variance. At first I thought of doing it this way: $$\text{Var} (X_1 + \dots + X_6) = \text{Var}(X_1) + \dots + \text{Var}(X_6) + 2 \sum_{1\le i<j\le6} \text{Cov}(X_i, X_j),$$ but the covariance is not easy to find here. Can somebody please show me how to find the variance of the number of dice rolls?

Best Answer

Just to give an explicit answer, so that a later question can be pointed here:

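The variance for the coupon collector's problem of collecting all $n$ distinct and equally likely coupons is simply the sum of the variances of the $n$ different geometric distributions, because the waiting times $X_i$ are independent and so every covariance term is zero. When $k$ coupons are still missing, the next new coupon arrives with probability $k/n$ per draw, and a geometric distribution with success probability $p$ has variance $(1-p)/p^2$, so the total is $$\sum_{k=1}^n \left(\left(\frac n k\right)^2 - \frac n k\right)$$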
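As a quick check of the sum above, here is a minimal Python sketch that evaluates it with exact rational arithmetic. The function names `coupon_mean` and `coupon_variance` are made up for this illustration.

```python
from fractions import Fraction


def coupon_mean(n):
    """Exact expected number of draws: sum of n/k for k = 1..n."""
    return sum(Fraction(n, k) for k in range(1, n + 1))


def coupon_variance(n):
    """Exact variance: sum of (n/k)^2 - n/k for k = 1..n.

    Each term is the variance of a geometric distribution with
    success probability k/n (k coupons still missing).
    """
    return sum(Fraction(n, k) ** 2 - Fraction(n, k) for k in range(1, n + 1))


if __name__ == "__main__":
    print(coupon_mean(6), coupon_variance(6))
```

For a six-sided die this prints `147/10 3899/100`, that is, a mean of $14.7$ and a variance of $38.99$.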

With $n=6$ this variance is exactly $38.99$, so the standard deviation is about $6.2441973$, which seems quite large when you consider that the expectation is $14.7$. Of the $38.99$, the part of the variance associated with collecting the final coupon (the $k=1$ term, $6^2 - 6$) is $30$.
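If you want to see these numbers empirically, a rough Monte Carlo sketch along the following lines should land near a mean of $14.7$ and a variance of $38.99$. The helper names, the default trial count and the seed are arbitrary choices for illustration, and the estimates fluctuate from run to run.

```python
import random


def rolls_until_all_faces(rng, sides=6):
    """Roll a fair die until every face has appeared; return the roll count."""
    seen = set()
    rolls = 0
    while len(seen) < sides:
        seen.add(rng.randint(1, sides))  # randint is inclusive on both ends
        rolls += 1
    return rolls


def simulate(trials=100_000, sides=6, seed=0):
    """Return (sample mean, sample variance) of the roll count over many trials."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    samples = [rolls_until_all_faces(rng, sides) for _ in range(trials)]
    mean = sum(samples) / trials
    variance = sum((x - mean) ** 2 for x in samples) / trials
    return mean, variance


if __name__ == "__main__":
    mean, variance = simulate()
    print(f"mean ~ {mean:.3f}, variance ~ {variance:.3f}")
```

The sample variance converges more slowly than the sample mean because the distribution has a long right tail, so expect the second figure to wobble more.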

An approximation for the variance is $$\frac{\pi^2}{6}n^2 - (\log_e(n)+1+\gamma)n - \frac{1}{12 n}$$ where $\gamma \approx 0.5772156649$ is the Euler-Mascheroni constant. With $n=6$ this is about $38.9898867$, very close to the exact $38.99$.
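To get a feel for how good the approximation is, the sketch below compares it with the exact sum in floating point. The function names are hypothetical, and $\gamma$ is hard-coded because Python's `math` module does not provide it as a constant.

```python
import math

# Euler-Mascheroni constant, hard-coded (not available in the math module)
EULER_GAMMA = 0.5772156649015329


def exact_variance(n):
    """Floating-point evaluation of sum_{k=1}^{n} ((n/k)^2 - n/k)."""
    return sum((n / k) ** 2 - n / k for k in range(1, n + 1))


def approx_variance(n):
    """Approximation (pi^2/6) n^2 - (ln n + 1 + gamma) n - 1/(12 n)."""
    return (math.pi ** 2 / 6) * n ** 2 - (math.log(n) + 1 + EULER_GAMMA) * n - 1 / (12 * n)


if __name__ == "__main__":
    for n in (6, 20, 100):
        exact, approx = exact_variance(n), approx_variance(n)
        print(n, exact, approx, exact - approx)
```

For $n=6$ the two values differ by roughly $10^{-4}$.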
