Binomial Distribution – When to Use and Not Use the Rule of Three in Probability

binomial distribution, probability

The rule of three in statistics states that if an event is binomially distributed and does not occur within $n$ trials, the maximum chance of it occurring is approximately $3/n$. Suppose we have a roulette table with only two options, red or black. The chance of either of these occurring is clearly 1/2. Suppose, however, that we don't see black for 10 turns of the wheel. We might then decide to reason, ignoring the prior knowledge of the distribution, that the chance of black occurring is at most 3/10, which is not true. Is this a misapplication of the rule? If so, why, and how does one determine when it is proper to apply it?

Best Answer

> The rule of three in statistics states that if an event is binomially distributed and does not occur within n trials, the maximum chance of it occurring is approximately 3/n.

No, that's not what it says. It says that a 95% confidence interval for the actual chance of it occurring is approximately [0, 3/n]. That is not the same thing. The largest value for the 'chance of occurring' contained in the interval is indeed 3/n, though the question of which of the values within the interval is most likely is not answered.

The rule says: 'guess that the true chance of occurring is 3/n or less and you will be wrong about 5% of the time.'
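For context, the 3/n figure comes from asking which value of p makes a run of n non-events have probability exactly 5%: solving (1 - p)^n = 0.05 gives p = 1 - 0.05^(1/n), which is roughly -ln(0.05)/n, or about 3/n. A minimal Python sketch comparing the exact bound with the approximation follows; the function names are illustrative and do not come from any library.

```python
import math

def exact_upper_bound(n, alpha=0.05):
    """Value of p at which zero events in n trials has probability alpha.

    Solves (1 - p)**n = alpha for p.
    """
    return 1.0 - alpha ** (1.0 / n)

def rule_of_three(n):
    """Approximate 95% upper bound after observing zero events in n trials."""
    return 3.0 / n

# Compare the exact bound with the 3/n shortcut for a few sample sizes.
for n in (10, 30, 100, 1000):
    print(n, round(exact_upper_bound(n), 4), round(rule_of_three(n), 4))
```

The approximation is always slightly above the exact bound, because -ln(0.05) is just under 3 and 1 - e^(-x) is less than x for x > 0.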

> Suppose we have a roulette table with only two options, red or black. The chance of either of these occurring is clearly 1/2.

Exactly, so there is no need for a confidence interval because the 'chance of occurring' is known. You could, on the other hand, test the coverage of the approximate interval that the rule provides using such a wheel.
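Such a coverage check might look like the sketch below. It assumes the rule is only invoked when zero events are seen in n trials, so the interval [0, 3/n] can only miss the true p when p exceeds 3/n and all n trials are non-events. The function names, the number of runs, and the seed are all hypothetical choices made for illustration.

```python
import random

def miss_probability(p, n):
    """Exact chance that the rule is invoked and [0, 3/n] excludes the true p."""
    if p <= 3.0 / n:
        return 0.0  # the interval contains p, so it cannot miss
    return (1.0 - p) ** n  # chance of zero events in n trials

def simulate_miss_rate(p, n, runs=200_000, seed=0):
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(runs):
        # rng.random() >= p is a non-event, which happens with probability 1 - p
        zero_events = all(rng.random() >= p for _ in range(n))
        if zero_events and p > 3.0 / n:
            misses += 1
    return misses / runs

# Two-colour wheel: p = 1/2 for black, n = 10 spins.
print(miss_probability(0.5, 10))
print(simulate_miss_rate(0.5, 10))
```

For the two-colour wheel the miss chance is 0.5^10, roughly 0.1%, far below 5%. The 5% figure is a worst case, approached only when p sits just above 3/n and n is large.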

> Suppose, however, that we don't see black for 10 turns of the wheel. We would then reason that the chance of black occurring is at most 3/10, which is not true. Is this a misapplication of the rule? If so, why, and how does one determine when it is proper to apply it?

It is a misapplication of the idea of a confidence interval, which is applied to bound the range of plausible values of things that are unknown, and which in any particular application need not contain the true value if it becomes known.