Solved – Should I use a binomial cdf or a normal cdf when flipping coins

binomial distribution, normal distribution, self-study

A coin needs to be tested for fairness. 30 heads come up after 50 flips. Assuming the coin is fair, what is the probability that you would get at least 30 heads in 50 flips?

The right way to do this problem, according to my teacher, is to do

normalcdf(min = .6, max = ∞, p = .5, σ = sqrt(.5 * .5 / 50)) = 0.0786

However, I took a binomial cumulative distribution function like this

1 - binomcdf(n = 50, p = .5, x = 29) = 0.1013

I believe the criteria for a binomial distribution are satisfied: the individual events are independent, there are only two possible outcomes (heads vs. tails), the probability of heads is constant (0.5), and the number of trials is fixed at 50. Yet the two methods give different answers, and a simulation supports my answer (at least the few times I ran it; since it is random, I can't guarantee that you'd get exactly the same results).
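The two calculations above, plus a quick simulation, can be reproduced with a minimal Python sketch using only the standard library. The variable names, the seed, and the number of simulated runs are illustrative assumptions, not part of the original question; the simulation trick of counting random bits only works because p = 0.5.

```python
from math import comb, sqrt
from statistics import NormalDist
import random

n, p, k = 50, 0.5, 30  # 50 flips, fair coin, at least 30 heads

# Exact binomial tail: P(X >= 30) = sum of P(X = j) for j = 30..50.
# Same quantity as 1 - binomcdf(n = 50, p = .5, x = 29).
exact_tail = sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# Normal model for the sample proportion, with no continuity correction.
# Same quantity as normalcdf(min = .6, max = inf, mean = .5, sd = sqrt(.5 * .5 / 50)).
sigma_phat = sqrt(p * (1 - p) / n)
normal_tail = 1 - NormalDist(mu=p, sigma=sigma_phat).cdf(k / n)

# Simulation: each of the 50 random bits is one fair coin flip (valid only for p = 0.5).
rng = random.Random(0)   # assumed seed, only for repeatability
trials = 100_000         # assumed number of simulated experiments
hits = sum(bin(rng.getrandbits(n)).count("1") >= k for _ in range(trials))
simulated_tail = hits / trials

print(f"exact binomial tail      : {exact_tail:.4f}")
print(f"normal, no correction    : {normal_tail:.4f}")
print(f"simulated ({trials} runs): {simulated_tail:.4f}")
```

The exact tail and the uncorrected normal value should come out near 0.1013 and 0.0786 respectively, matching the two figures quoted above; the simulated value varies with the seed.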

Is my teacher wrong in assuming that a Normal distribution curve would also be a valid way to do this problem (at no point is it said that the distribution is Normal, but n*p and n*(1-p) are both greater than 10), or have I misunderstood something about binomial distributions?

Best Answer

Here is an illustration of the answers of whuber and onestop.

[Figure: continuity correction]

The red bars show the binomial distribution $\mathcal Bin(50,0.5)$, the black curve is the density of the normal approximation $\mathcal N(25, 12.5)$ (mean 25, variance 12.5), and the blue region is the area corresponding to $\mathbb P(Y > 29.5)$ for $Y \sim \mathcal N(25, 12.5)$.

The height of a red bar corresponding to $\mathbb P(X=k)$ for $X\sim\mathcal Bin(50,0.5)$ is well approximated by $\mathbb P\left( k -{1\over 2} < Y < k + {1\over 2}\right)$. To get a good approximation of $\mathbb P(X \ge 30)$, you need to use $\mathbb P(Y>29.5)$.

(edit) This is $$\mathbb P(Y>29.5) \simeq 0.1015459,$$ (obtained in R by 1-pnorm(29.5,25,sqrt(12.5))) whereas $$\mathbb P(X \ge 30) \simeq 0.1013194:$$ the two values agree to three decimal places, so the approximation is good.
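As a rough cross-check of these numbers outside R, here is a short Python sketch (standard library only; the variable names are illustrative assumptions). It also evaluates the normal tail at 30 instead of 29.5, which reproduces the 0.0786 from the question and shows where the discrepancy comes from.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.5

# Normal approximation N(25, 12.5): 12.5 is the variance, so sigma = sqrt(12.5).
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

corrected = 1 - approx.cdf(29.5)    # P(Y > 29.5), cut halfway between 29 and 30
uncorrected = 1 - approx.cdf(30)    # P(Y > 30), cut placed at 30 itself

# Exact binomial tail P(X >= 30); for p = 0.5 every outcome has weight 1 / 2**n.
exact = sum(comb(n, j) for j in range(30, n + 1)) / 2**n

print(f"P(Y > 29.5) = {corrected:.7f}")
print(f"P(Y > 30)   = {uncorrected:.7f}")
print(f"P(X >= 30)  = {exact:.7f}")
```

Placing the cut at 30 drops roughly half of the bar at k = 30 from the approximating area, which is why the uncorrected value comes out too small.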

Shifting the cut-off by one half in this way is called a continuity correction. It even lets you approximate "point probabilities" like $\mathbb P(X=22)$: $$\begin{align*} \mathbb P(X=22) &= {50 \choose 22} 0.5^{22} \cdot 0.5^{28} \simeq 0.07882567, \\ \mathbb P(21.5 < Y < 22.5) & \simeq 0.2397501 - 0.1610994 \simeq 0.07865066.\end{align*}$$
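The point-probability comparison can be sketched the same way in Python. The helper names binom_pmf and normal_point_approx are hypothetical, introduced only for this illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(k, n=50, p=0.5):
    """Exact P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_point_approx(k, n=50, p=0.5):
    """Approximate P(X = k) by P(k - 1/2 < Y < k + 1/2), Y ~ N(np, np(1-p))."""
    y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return y.cdf(k + 0.5) - y.cdf(k - 0.5)

print(f"exact  P(X = 22)          = {binom_pmf(22):.8f}")
print(f"approx P(21.5 < Y < 22.5) = {normal_point_approx(22):.8f}")
```

Calling both helpers for other values of k gives a feel for where the approximation is tight (near the mean) and where it degrades (far in the tails).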
