[Math] Normal approximation to the binomial distribution

probability

According to Wikipedia, the binomial distribution $B(n, p)$ is approximately normal with mean $np$ and variance $np(1-p)$ for large $n$ and for $p$ not too close to zero or one. Why is that, and why the condition on $p$?

I know that a sum of independent Bernoulli variables has a binomial distribution, but this does not seem to be enough to apply the central limit theorem: the theorem involves the mean of the variables, not only their sum. Can somebody show me at which point one obtains a mean of variables?

Best Answer

If $p$ is close to $0$ or $1$, the approximation is distorted because a binomial variable can't go below $0$ or above $n$, while the normal distribution has tails that go on forever. When $p$ is "reasonable", the probability the normal puts outside $[0, n]$ is negligibly small, but if you have a significant chance of hitting one of the ends, the fit won't be so close. Of course, if $p$ is exactly $0$ or $1$ you are fine, as the variance becomes zero and the distribution collapses to a single point.
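To make both points concrete, here is a minimal numerical sketch using only the Python standard library. On the sum-versus-mean question: dividing numerator and denominator by $n$ gives $\frac{S_n - np}{\sqrt{np(1-p)}} = \frac{S_n/n - p}{\sqrt{p(1-p)/n}}$, so standardizing the sum and standardizing the mean yield the same quantity, and the central limit theorem applies to either. On the condition on $p$: the sketch measures the largest gap between the binomial CDF and the normal CDF. The function names, the choice $n = 100$, and the use of a continuity correction (evaluating the normal CDF at $k + 0.5$) are my own assumptions for illustration, not something taken from the Wikipedia statement.

```python
import math


def binom_pmf(n, p, k):
    """Exact binomial probability P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_cdf(x, mean, sd):
    """CDF of a normal distribution with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def max_cdf_gap(n, p):
    """Largest |binomial CDF - normal CDF| over k = 0..n.

    The normal CDF is evaluated at k + 0.5 (continuity correction),
    which is an illustrative choice, not part of the original claim.
    """
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    cumulative = 0.0
    worst = 0.0
    for k in range(n + 1):
        cumulative += binom_pmf(n, p, k)
        worst = max(worst, abs(cumulative - normal_cdf(k + 0.5, mean, sd)))
    return worst


def standardize_sum(s, n, p):
    """(S_n - np) / sqrt(np(1-p)): the sum, centered and scaled."""
    return (s - n * p) / math.sqrt(n * p * (1 - p))


def standardize_mean(s, n, p):
    """(S_n/n - p) / sqrt(p(1-p)/n): the mean, centered and scaled."""
    return (s / n - p) / math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    n = 100
    for p in (0.5, 0.1, 0.01):
        print(f"n={n}, p={p}: max CDF gap = {max_cdf_gap(n, p):.4f}")
    # The two standardizations agree up to floating-point rounding.
    print(standardize_sum(37, n, 0.3), standardize_mean(37, n, 0.3))
```

Running it, the gap for $p = 0.5$ should come out much smaller than the gap for $p = 0.01$, where the binomial mass piles up against $0$ and the normal curve puts noticeable probability on negative values that the binomial can never take.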
