Probability – Variance of Sum of Numbers on Drawn Tickets

expected value, probability, variance

$m$ tickets are drawn without replacement from $n$ tickets numbered $1$ to $n$. Let $X$ denote the sum of the numbers on the tickets drawn. Find $V(X)$.

One option is to write $X = X_1+X_2+\cdots+X_m$, where $X_i$ is the $i$th number drawn.
Alternatively, $X_i$ can be treated as the indicator variable that ticket number $i$ is drawn, for $i=1,2,\ldots,n$, in which case $X=\sum_{i=1}^n iX_i$.

Either way, I am able to get the expectation, since dependence between the variables does not matter there. When calculating the variance, however, dependence does matter: in computing $E(X_iX_j)$, the second draw depends on the first, since there is a constraint on the sum $X$.

Please answer.

Best Answer

Here is a slightly indirect way of obtaining the variance:

Let $X_k$ be the number on the $k$th ticket, $k=1,2,\ldots,m$.

By symmetry, each $X_k$ is marginally uniform on $\{1,2,\ldots,n\}$, namely

$$ P(X_k=j)=\begin{cases}\frac{1}{n}&,\text{ if }j=1,2,\ldots,n\\0&,\text{ otherwise}\end{cases}$$

So,

\begin{align} \operatorname{Var}(X_k)&=E(X_k^2)-(E(X_k))^2 \\&=\frac{(n+1)(2n+1)}{6}-\left(\frac{n+1}{2}\right)^2 \\&=\frac{n^2-1}{12}=\sigma^2\,,\text{ say} \end{align}

If the correlation between $X_i$ and $X_j$ $\,(i\ne j)$ is $\rho$, then $$\rho=\dfrac{\operatorname{Cov}(X_i,X_j)}{\sigma^2}$$

You are looking for \begin{align}\operatorname{Var}(X)&=\operatorname{Var}\left(\sum_{k=1}^m X_k\right)\\&=\sum_{k=1}^m \operatorname{Var}(X_k)+2\sum_{i<j}\text{Cov}(X_i,X_j)\\&=m\sigma^2+2\binom{m}{2}\rho\sigma^2 \\&=m\sigma^2(1+(m-1)\rho)\tag{1}\end{align}

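Now note that the joint distribution of $(X_i,X_j)$, $i\ne j$, does not depend on $m$, so $\rho$ is the same for every $m$.

In particular, taking $m=n$ means every ticket is drawn and the sum is the constant $n(n+1)/2$, so we see that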

\begin{align} \operatorname{Var}\left(\sum_{k=1}^{\color{red}{n}}X_k\right)&=\operatorname{Var}(\text{constant})=0 \\&\implies\color{red}{n}\sigma^2(1+(\color{red}{n}-1)\rho)=0 \\&\implies\rho=\frac{1}{1-n} \end{align}
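As a sanity check on this value of $\rho$, here is a minimal Python sketch that computes the correlation between two distinct draws exactly, by enumerating all equally likely ordered pairs. The function name `pair_correlation` is only illustrative, and the sketch assumes sampling without replacement with $n\ge 2$.

```python
from fractions import Fraction
from itertools import permutations

def pair_correlation(n):
    """Exact correlation between two distinct draws (without replacement) from 1..n."""
    # All ordered pairs (a, b) with a != b are equally likely.
    pairs = list(permutations(range(1, n + 1), 2))
    count = len(pairs)
    # By symmetry both coordinates have the same mean and variance.
    mean = Fraction(sum(a for a, _ in pairs), count)
    var = Fraction(sum(a * a for a, _ in pairs), count) - mean ** 2
    cov = Fraction(sum(a * b for a, b in pairs), count) - mean ** 2
    return cov / var

# Compare against rho = 1 / (1 - n) for a few small n.
for n in (2, 5, 10):
    assert pair_correlation(n) == Fraction(1, 1 - n)
```

Exact rational arithmetic is used so that the comparison is not affected by floating-point rounding.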

Substituting this value of $\rho$ and the value of $\sigma^2$ in $(1)$, we finally get the variance of $X$ as

$$\operatorname{Var}(X)=\frac{m(n+1)(n-m)}{12}$$
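The closed form can also be checked by brute force for small cases. The sketch below, with illustrative function names, enumerates every equally likely set of $m$ tickets, computes the exact variance of the sum, and compares it with $\frac{m(n+1)(n-m)}{12}$. It assumes the tickets are drawn without replacement.

```python
from fractions import Fraction
from itertools import combinations

def exact_variance(n, m):
    """Exact variance of the sum of m tickets drawn without replacement from 1..n."""
    # Every m-subset of {1, ..., n} is equally likely.
    sums = [sum(c) for c in combinations(range(1, n + 1), m)]
    count = len(sums)
    mean = Fraction(sum(sums), count)
    return Fraction(sum(s * s for s in sums), count) - mean ** 2

def formula_variance(n, m):
    """Closed form m(n+1)(n-m)/12 derived above."""
    return Fraction(m * (n + 1) * (n - m), 12)

# Check all 1 <= m <= n for small n.
for n in range(1, 9):
    for m in range(1, n + 1):
        assert exact_variance(n, m) == formula_variance(n, m)
```

For example, with $n=5$ and $m=2$ both functions give a variance of $3$, and with $m=n$ both give $0$, as expected when every ticket is drawn.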