Probability of a blue ball when selecting from a bag with substitution

probability, probability distributions, statistics

I have a bag containing a random, unknown number of balls, all of them blue. It's OK to assume the number is within some reasonable range (tens?). Each round I pull out a ball. If it's blue, I remove it and add a red ball. If it's red, I put it back. I want to repeat this until I think no blue balls are left, or until the probability of getting a blue ball is low enough (1%?) that I can stop.

If I know the number of blue balls I've pulled, which equals the red balls that I've added, can I estimate the probability of pulling another blue ball?

For what it's worth, I'm most interested in estimating when I can stop — when the probability of pulling a blue ball falls below, say, 1%, and it's just not worth proceeding.

Best Answer

One reasonable criterion for stopping is seeing a lot of red balls in a row. (If you are still pulling blue balls, you should keep going, since there still seems to be a non-negligible proportion of blue balls.) The question is: how many red balls in a row do you need to see before you stop?

Let's say $B$ is the original number of blue balls, and $b$ is the number of blue balls you've seen already, so that the chance of drawing a blue ball is now $p:=(B-b)/B$ (the bag always holds $B$ balls, because every blue ball removed is replaced by a red one). Let's say you drew $n$ red balls in a row in this streak. Red balls go back into the bag, so $p$ stays fixed during the streak, and the number of blue balls $X$ seen in those $n$ draws follows a $\text{Binomial}(n, p)$ distribution, which is approximately $Y \sim N(np, np(1-p))$. The chance of seeing no blue balls in $n$ draws is $$P(X \le 0) \approx P(Y \le 0) = \Phi\left(\frac{-np}{\sqrt{np(1-p)}}\right) = \Phi\left(-\sqrt{np/(1-p)}\right).$$
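As a rough numerical check of the formula above, here is a minimal Python sketch. The function names are made up for illustration. `streak_prob_exact` is an addition not found in the answer: because red balls are returned to the bag, the $n$ draws are independent with the same $p$, so the exact binomial value is $P(X = 0) = (1-p)^n$. Comparing the two gives a feel for how rough the normal approximation can be when $p$ is small.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def streak_prob_normal(n: int, p: float) -> float:
    """Normal approximation from the answer: Phi(-sqrt(n p / (1 - p)))."""
    return normal_cdf(-math.sqrt(n * p / (1.0 - p)))

def streak_prob_exact(n: int, p: float) -> float:
    """Exact binomial P(X = 0) = (1 - p)^n (added here for comparison)."""
    return (1.0 - p) ** n

if __name__ == "__main__":
    p = 0.01  # hypothetical blue-ball proportion
    for n in (50, 100, 300, 500):
        print(n, round(streak_prob_normal(n, p), 4), round(streak_prob_exact(n, p), 4))
```

For small $p$ and moderate $n$ the two columns can differ noticeably, so it may be worth checking the exact value before settling on a streak length.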

For fixed $p > 0$, taking $n \to \infty$ causes this probability to converge to zero. (This makes sense: when $p > 0$, you should eventually see a blue ball.) The closer $p$ is to zero, the larger $n$ must be to make this probability small. If you have a target in mind, like the $1\%$ you mentioned, you can set $p=0.01$ and choose $n$ large enough to make $\Phi\left(-\sqrt{np/(1-p)}\right)$ small enough (you have to decide how small is small). Then you can say "I saw a streak of $n$ red balls. If the proportion of blue balls in the bag at the time was $\ge 1\%$, this would have happened with probability $\le (\text{small number})$, so I will stop."
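The recipe above can be sketched in Python as follows. This is only an illustration: the function names and the choice of $0.05$ as the "small number" are assumptions, not part of the answer. Solving $\Phi\left(-\sqrt{np/(1-p)}\right) \le \alpha$ for $n$ gives $n \ge z_\alpha^2 (1-p)/p$ with $z_\alpha = \Phi^{-1}(\alpha)$, valid for $\alpha < 0.5$. The exact binomial version, $(1-p)^n \le \alpha$, is included for comparison and is likewise an addition.

```python
import math
from statistics import NormalDist

def min_streak_normal(p: float, alpha: float) -> int:
    """Smallest n with Phi(-sqrt(n p / (1 - p))) <= alpha (assumes alpha < 0.5)."""
    z = NormalDist().inv_cdf(alpha)  # negative quantile when alpha < 0.5
    return math.ceil(z * z * (1.0 - p) / p)

def min_streak_exact(p: float, alpha: float) -> int:
    """Smallest n with (1 - p)^n <= alpha (exact binomial, added for comparison)."""
    return math.ceil(math.log(alpha) / math.log(1.0 - p))

if __name__ == "__main__":
    p = 0.01      # target blue-ball proportion from the question
    alpha = 0.05  # hypothetical choice of "small number"
    print("normal approximation:", min_streak_normal(p, alpha))
    print("exact binomial:      ", min_streak_exact(p, alpha))
```

With these inputs the required streak comes out in the hundreds, which is worth keeping in mind if the bag only holds tens of balls: ruling out a 1% blue proportion takes many more draws than there are balls.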
