[Math] Calculating independent indicator random variables

probability

An urn contains 30 balls, of which 10 are red, 8 are blue, and 12 are white.
From this urn, 12 balls are randomly withdrawn without replacement. Let X
denote the number of red and Y the number of blue balls that are withdrawn.
Let $X_i$ be the indicator random variable of the ith red ball being drawn and $Y_j$ be the indicator random variable of the jth blue ball being drawn.
Evaluate $E[X_i]$, $E[Y_j]$ and $E[X_iY_j]$.

Could someone explain part of the given solution to me?

$E[X_i] = P(\text{red ball withdrawn}) = \frac{29\choose11}{30\choose12} = 0.4$,

$E[Y_j] = P(\text{blue ball withdrawn}) = \frac{29\choose11}{30\choose12} = 0.4$,

Since $E[X_i]$ and $E[Y_j]$ are independent, we have $E[X_iY_j] = 0.4 \times 0.4 = 0.16$ ***

This last statement is what I feel queasy about. It makes a little bit of sense to me that $X$ and $Y$ should be somewhat independent. Does $E[X_iY_j] = 0.16$ mean that the probability of drawing both the $i$-th red ball and the $j$-th blue ball is 0.16? It appears that this value should be $\frac{28\choose10}{30\choose 12}$. Since the draw is without replacement, shouldn't the events be dependent?

In a similar vein, suppose that I have 4 balls (red, yellow, blue, green) and I draw two of them at random. The probability of drawing both the red and the yellow ball certainly isn't 1/4 * 1/4 = 1/16.

What misconception am I having here?

Best Answer

$X_i$ is the indicator that the $i$-th red ball has been selected, for $1\leq i\leq 10$. Thus $$\mathsf E(X_i) = \mathsf P(\text{that event}) = \dfrac{\binom{29}{11}\binom{1}{1}}{\binom{30}{12}} = \dfrac{12}{30} = 0.4$$

This holds because the favourable outcomes are the ways to pick that ball together with $11$ of the $29$ others, out of all the ways to pick any $12$ of the $30$ balls.

Likewise for $\mathsf E(Y_j)$ for $1\leq j\leq 8$.

Similarly, $X_iY_j$ is the indicator that both the $i$-th red and the $j$-th blue ball have been picked. So yes, clearly $\mathsf E(X_iY_j) = \dfrac{\binom{28}{10}}{\binom{30}{12}} = \dfrac{22}{145} \approx 0.152$, not $0.16$. The indicators are not independent.
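As a quick sanity check, the counting argument can be evaluated exactly in Python. This is only a sketch: the helper name `inclusion_prob` is made up for illustration, and the brute-force part enumerates the four-ball urn from the question, with balls labelled `R`, `Y`, `B`, `G`.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def inclusion_prob(total, drawn, k):
    """P(k specific balls are all among `drawn` balls taken from `total`)."""
    # Fix the k specific balls, then choose the remaining drawn - k freely.
    return Fraction(comb(total - k, drawn - k), comb(total, drawn))


# Urn from the question: 30 balls, 12 drawn without replacement.
e_xi = inclusion_prob(30, 12, 1)      # E[X_i], and also E[Y_j]
e_xi_yj = inclusion_prob(30, 12, 2)   # E[X_i Y_j]

print(e_xi)          # 2/5
print(e_xi_yj)       # 22/145
print(e_xi * e_xi)   # 4/25, the product used in the quoted solution

# Four-ball urn from the question, checked by listing every possible draw.
draws = list(combinations("RYBG", 2))
both = sum(1 for d in draws if "R" in d and "Y" in d)
p_both = Fraction(both, len(draws))
print(p_both)        # 1/6
```

In the four-ball urn each individual ball is drawn with probability $1/2$, so the product of the marginals would be $1/4$, while the enumerated joint probability is $1/6$. The same kind of gap separates $4/25$ from $22/145$ in the original urn.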


These values are used in calculating the covariance of $X$ and $Y$. The covariance should be expected to be nonzero, and indeed negative, because the more red balls are picked, the fewer blue balls can be, and vice versa.

$$\begin{align} \mathsf{Cov}(X,Y) & = \mathsf E(XY)-\mathsf E(X)\mathsf E(Y) \\[1ex] & = \sum_{i=1}^{10}\sum_{j=1}^{8}\mathsf E(X_iY_j) - \sum_{i=1}^{10}\mathsf E(X_i) \sum_{j=1}^{8}\mathsf E(Y_j) \\[1ex] & = \dfrac{10\cdot 8\cdot\binom{28}{10}}{\binom{30}{12}} - \frac{10\cdot\binom{29}{11}}{\binom{30}{12}}\frac{8\cdot\binom{29}{11}}{\binom{30}{12}} \\[1ex] & =\dfrac{80\cdot\left(\binom{28}{10}\binom{30}{12}-\binom{29}{11}^2\right)}{\binom{30}{12}^2} \\[2ex] & = \dfrac{-96}{145} \end{align}$$

As anticipated.
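The arithmetic can be reproduced with exact fractions. The sketch below follows the same indicator decomposition; the variable names are chosen here for illustration. The final cross-check uses the standard covariance formula for multivariate hypergeometric counts, which is not part of the derivation above and is included only as an independent comparison.

```python
from fractions import Fraction
from math import comb

N, n = 30, 12          # balls in the urn, balls drawn
reds, blues = 10, 8    # number of red and blue balls

p_one = Fraction(comb(N - 1, n - 1), comb(N, n))   # E[X_i] = E[Y_j]
p_two = Fraction(comb(N - 2, n - 2), comb(N, n))   # E[X_i Y_j]

# Linearity of expectation over the indicator sums X = sum X_i, Y = sum Y_j.
e_x = reds * p_one             # E[X]
e_y = blues * p_one            # E[Y]
e_xy = reds * blues * p_two    # E[XY], summed over all 80 (i, j) pairs

cov = e_xy - e_x * e_y
print(cov)   # -96/145

# Independent cross-check (assumed standard formula, not from the answer):
# Cov = -n * (K1/N) * (K2/N) * (N - n)/(N - 1)
closed_form = -n * Fraction(reds, N) * Fraction(blues, N) * Fraction(N - n, N - 1)
print(closed_form == cov)   # True
```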