[Math] Green balls and Red balls, probability problem

combinatorics, probability, statistics

I'm studying for my exam and I came across the following draw-without-replacement problem:

$N$ boxes filled with red and green balls.
The box $r$ contains $r-1$ red balls and $N-r$ green balls.
We pick a box at random and we take $2$ random balls inside it, without putting them back.

1) What is the probability that the second ball is green?

2) What is the probability that the second ball is green, given that the first one is green?

I don't know where to start; all these dependencies (on $r$ and $N$) are blowing my mind.

I don't know if I should consider it as a binomial distribution (with Bernoulli trials: ball = green, $n=2, p = ?$) or go with the formula $$p(X=k)=\frac{C_{m}^{k} C_{N-m}^{n-k}}{C_{N}^{n}}$$
or something else…

Could someone advise me?

Best Answer

All of the boxes contain $N - 1$ balls. This is just a complicated conditional probability problem. Let's look at a single box with $r$ red balls and $g$ green balls (here $r$ is the number of red balls in the box, not the box index from the question). What is the probability of getting green on the second draw? It depends on whether you draw a red or a green ball first. If you draw a red first, then all $g$ green balls are still among the remaining $r + g - 1$, so $\left.p(\text{second green } \right| \text{ first red}) = \frac{g}{r + g - 1}$. However, if you draw a green ball first then you have one less green to choose from, giving $\left.p(\text{second green } \right| \text{ first green}) = \frac{g - 1}{g + r - 1}$. So what are the chances of each condition happening? $p(\text{first red}) = \frac{r}{g + r}$ and $p(\text{first green}) = \frac{g}{r + g}$. Therefore we can finally write:

\begin{align} p(\text{second green}) =& \left.p(\text{second green } \right| \text{ first red})p(\text{first red}) + \left.p(\text{second green } \right| \text{ first green})p(\text{first green})\\ =& \frac{r}{r + g}\frac{g}{r+g-1} + \frac{g}{r + g}\frac{g-1}{r+g-1} = \frac{g(r + g - 1)}{(r + g)(r + g - 1)} = \frac{g}{r + g} \end{align}

Not surprisingly, the second ball has exactly the same chance of being green as the first.
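The single-box identity can be checked with exact arithmetic. The following is a minimal sketch, not part of the original answer; the function name `p_second_green` is chosen here purely for illustration.

```python
from fractions import Fraction

def p_second_green(r: int, g: int) -> Fraction:
    """Exact P(second ball is green) for one box with r red and g green balls."""
    total = r + g
    p_first_red = Fraction(r, total)
    p_first_green = Fraction(g, total)
    # Law of total probability: condition on the colour of the first ball.
    return (p_first_red * Fraction(g, total - 1)
            + p_first_green * Fraction(g - 1, total - 1))

# Compare against g / (r + g) for every small box holding at least two balls.
for r in range(0, 5):
    for g in range(0, 5):
        if r + g >= 2:
            assert p_second_green(r, g) == Fraction(g, r + g)
```

Using `Fraction` instead of floats keeps the comparison exact, so the assertions test the algebraic identity rather than a rounded approximation.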

Therefore for each of the $N$ boxes you need to compute $p(\text{second green})$ (which is just the probability of drawing a green on the first try). Box $r$ of the original problem has $N - r$ green balls out of $N - 1$, so for that box $p(\text{second green}) = p(\text{first green}) = \frac{N - r}{N - 1}$. The probability of choosing box $r$ among $N$ boxes is just $\frac{1}{N}$, which gives:

$$ p(\text{second green}) = \sum_{r=1}^N \frac{1}{N}\frac{N - r}{N - 1} = \frac{1}{N(N - 1)}\sum_{r=1}^N (N - r) = \frac{1}{N(N - 1)}\left(\sum_{r=1}^N N - \sum_{r=1}^N r\right) $$

The first sum is very easy (you're just summing the same number, $N$, $N$ times): $\sum_{r=1}^N N = N\cdot N = N^2$. The second is easy if you remember that the sum of the first $n$ consecutive integers is $\sum_{i=1}^n i = \frac{n(n + 1)}{2}$. So this gives:

$$ p(\text{second green}) = \frac{N^2 - \frac{N(N + 1)}{2}}{N(N - 1)} = \frac{2N^2 - N^2 - N}{2N(N - 1)} = \frac{N^2 - N}{2\left(N^2 - N\right)} = \frac{1}{2} $$
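The sum over boxes for part 1) can also be evaluated exactly for concrete values of $N$. This is a small sketch under the problem's setup (box $r$ holds $r - 1$ red and $N - r$ green balls); the function name is hypothetical.

```python
from fractions import Fraction

def p_second_green_all_boxes(N: int) -> Fraction:
    """Exact P(second green) when one of N boxes is chosen uniformly at random.

    Box r (r = 1..N) holds r - 1 red and N - r green balls.
    """
    if N < 3:
        raise ValueError("need N >= 3 so that every box holds at least two balls")
    total = Fraction(0)
    for r in range(1, N + 1):
        green = N - r
        # Within one box, P(second green) = P(first green) = green / (N - 1).
        total += Fraction(1, N) * Fraction(green, N - 1)
    return total

for N in (3, 4, 10, 50):
    print(N, p_second_green_all_boxes(N))
```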

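For part $2$), the single-box value was already computed above: $\left.p(\text{second green }\right|\text{ first green}) = \frac{g - 1}{g + r - 1}$, which for box $r$ of the problem is $\frac{N - r - 1}{N - 2}$. However, we cannot simply average this over the boxes with equal weights. Seeing a green ball first rules out box $N$ (it has $0$ green balls) and, more generally, makes the boxes with many green balls more likely than the others. The safest route is the definition of conditional probability, $\left.p(\text{second green }\right|\text{ first green}) = \frac{p(\text{both green})}{p(\text{first green})}$, where $p(\text{first green}) = \frac{1}{2}$ by the same computation as in part $1$). For the numerator, box $r$ contributes $\frac{N - r}{N - 1}\cdot\frac{N - r - 1}{N - 2}$, so with $k = N - r$:

\begin{align} p(\text{both green}) =& \sum_{r=1}^{N} \frac{1}{N}\frac{(N - r)(N - r - 1)}{(N - 1)(N - 2)} \\ =& \frac{1}{N(N - 1)(N - 2)}\sum_{k=0}^{N - 1} k(k - 1) \\ =& \frac{1}{N(N - 1)(N - 2)}\left(\frac{(N - 1)N(2N - 1)}{6} - \frac{N(N - 1)}{2}\right) \\ =& \frac{1}{N(N - 1)(N - 2)}\cdot\frac{N(N - 1)(N - 2)}{3} \\ =& \frac{1}{3} \end{align}

Therefore $\left.p(\text{second green }\right|\text{ first green}) = \frac{1/3}{1/2} = \frac{2}{3}$.

This is only valid for $N > 2$ (since if $N = 1$ there are no balls in each box and if $N = 2$ there is only one ball in each box). Since $\frac{2}{3} \neq \frac{1}{2}$, the two draws are not independent: a green first ball is evidence that we picked a box rich in green balls, which makes a second green more likely.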
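As an independent check on the algebra for part 2), the conditional probability can be computed by brute force, listing every ordered pair of balls in every box. The sketch below is an illustration only; the function name `green_given_green` is an assumption, not something from the original post.

```python
from fractions import Fraction
from itertools import permutations

def green_given_green(N: int) -> Fraction:
    """P(second green | first green), by enumerating ordered pairs of balls."""
    if N < 3:
        raise ValueError("need N >= 3 so that every box holds at least two balls")
    first_green = 0
    both_green = 0
    for r in range(1, N + 1):
        # Box r: r - 1 red balls followed by N - r green balls.
        balls = "R" * (r - 1) + "G" * (N - r)
        # Every box holds N - 1 balls, so each box contributes the same number
        # of ordered pairs and plain counts are proportional to probabilities.
        for first, second in permutations(balls, 2):
            if first == "G":
                first_green += 1
                if second == "G":
                    both_green += 1
    return Fraction(both_green, first_green)

for N in (3, 4, 10):
    print(N, green_given_green(N))
```

`permutations` treats balls at different positions as distinct even when they share a colour, which is what makes every ordered pair equally likely within a box.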