[Math] Why does the Binomial distribution fit the Normal distribution

central-limit-theorem, probability, probability-theory, probability-limit-theorems, statistics

I use Statgraphics for working with statistics.

I generated 3 Binomial samples:

1) N=100, p=0.5

2) N=100, p=0.01

3) N=100, p=0.99

The first sample looks like a Normal sample:

[Image: histogram of the first sample]

And it passes the Kolmogorov-Smirnov test:

[Image: Kolmogorov-Smirnov test output]

So I have a theoretical question from probability theory: why does the first sample act like a Normal sample while the other two don't?

Best Answer

Let $X_1, X_2, \dots$ be a sequence of independent, identically distributed random variables with common mean $\mu$ and variance $\sigma^2$. Then the sum $T_n = \sum_{i=1}^n X_i$ of the first $n$ of the $X_i$ has $E(T_n) = n\mu$ and $Var(T_n) = n\sigma^2$. Thus $Z_n = \frac{T_n - n \mu}{\sqrt{n}\sigma}$ has $E(Z_n) = 0$ and $Var(Z_n) = 1.$

The Central Limit Theorem (CLT) says that the sequence $Z_n$ converges in distribution to standard normal. That is, $P(Z_n \le z) \rightarrow \Phi(z)$, for any real $z$, where $\Phi$ is the standard normal CDF. (A similar statement can be made for averages $\bar X_n = (1/n)\sum_{i=1}^n X_i$ and $Z_n = \frac{\bar X_n - \mu}{\sigma/\sqrt{n}}.$)
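To see the convergence concretely, here is a minimal sketch in Python (numpy and scipy assumed; this is not part of the original answer) that standardizes sums of $Exp(1)$ variables, for which $\mu = \sigma = 1$, and compares the empirical value of $P(Z_n \le 1)$ with $\Phi(1) \approx 0.8413$:

```python
# Numerical check that P(Z_n <= z) approaches Phi(z) as n grows.
# Illustration only: for Exp(1), mu = sigma = 1, so Z_n = (T_n - n)/sqrt(n).
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
z = 1.0
for n in (5, 50, 500):
    sums = rng.exponential(1, size=(20_000, n)).sum(axis=1)  # 20,000 draws of T_n
    z_n = (sums - n) / np.sqrt(n)                            # standardized sums
    print(f"n={n:3d}: P(Z_n <= 1) ~ {np.mean(z_n <= z):.4f}")
print(f"Phi(1) = {stats.norm.cdf(z):.4f}")  # limit value, about 0.8413
```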

Whenever you want to put a limit theorem to practical use (such as approximation), your first question should be "how fast is the convergence?" Infinity itself is a 'long way away'. Roughly speaking, the convergence in the CLT is more rapid for symmetrical random variables $X_i$ than for skewed ones. Thus, the sum of only ten independent $X_i \sim Unif(0,1)$ (symmetrical) is very nearly normal, while the sum of 25 independent random variables $X_i \sim Exp(1)$ (markedly skewed) is clearly not so well approximated by a normal distribution. The diagram below illustrates this with 10,000 such sums of each type of random variable. The appropriate normal density is shown with each histogram.

[Image: histograms of 10,000 sums of 10 Unif(0,1) and of 25 Exp(1), each overlaid with the matching normal density]
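A minimal simulation along the same lines (Python with numpy, scipy, and matplotlib; the original figure was produced with other software) that draws 10,000 sums of each kind and overlays the normal density with matching mean and standard deviation:

```python
# Compare sums of 10 Unif(0,1) versus sums of 25 Exp(1): each histogram
# is overlaid with the normal density of matching mean and sd.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(seed=2024)
reps = 10_000

unif_sums = rng.uniform(0, 1, size=(reps, 10)).sum(axis=1)  # mean 5, var 10/12
exp_sums = rng.exponential(1, size=(reps, 25)).sum(axis=1)  # mean 25, var 25

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
panels = [
    (axes[0], unif_sums, 5.0, np.sqrt(10 / 12), "Sums of 10 Unif(0,1)"),
    (axes[1], exp_sums, 25.0, 5.0, "Sums of 25 Exp(1)"),
]
for ax, sums, mu, sd, title in panels:
    ax.hist(sums, bins=50, density=True, alpha=0.6)
    x = np.linspace(sums.min(), sums.max(), 300)
    ax.plot(x, stats.norm.pdf(x, mu, sd))  # matching normal density
    ax.set_title(title)
plt.show()
```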

The speed of convergence is quite fast in some instances and rather slow in others. A binomial random variable $X \sim Binom(n, \theta)$ is the sum of $n$ independent Bernoulli random variables with success probability $\theta.$ So, for sufficiently large $n$ a Binomial random variable will have a distribution that is 'approximately' normal. In the binomial case, the convergence is much faster if $\theta \approx 1/2$ (symmetrical) than if $\theta$ is near 0 or 1. You have already discovered that $Binom(100, .5)$ is closer to normal than $Binom(100, .01).$
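The experiment from the question can be run the same way programmatically. Below is a hedged sketch (Python with numpy/scipy standing in for Statgraphics; the sample size of 200 draws is an assumption, since the question gives only the binomial parameters) that applies a Kolmogorov-Smirnov test against the normal distribution with matching mean and standard deviation. The KS test is only approximate for discrete data, but it serves for illustration:

```python
# Draw samples from Binom(100, theta) and test each against the normal
# with matching mean n*theta and sd sqrt(n*theta*(1-theta)).
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=2024)
n, size = 100, 200  # size (number of draws) is an assumed value

for theta in (0.5, 0.01, 0.99):
    sample = rng.binomial(n, theta, size=size)
    mu, sd = n * theta, np.sqrt(n * theta * (1 - theta))
    ks = stats.kstest(sample, "norm", args=(mu, sd))
    print(f"theta={theta}: D={ks.statistic:.3f}, p={ks.pvalue:.4f}")
# Expect a large p-value (normal fit not rejected) for theta = 0.5 and
# very small p-values for theta = 0.01 and theta = 0.99.
```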

Sometimes a 'rule of thumb' is used. It says that the normal approximation to the binomial is 'reasonably' good if $n\theta$ and $n(1-\theta)$ both exceed 5. Few 'rules of thumb' are always accurate. There are better rules for normal/binomial fit, but this one is often quoted, probably because it is usually pretty good and easy to remember.
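Applied to the three samples from the question, the rule gives (a trivial check, shown here for concreteness):

```python
# Rule of thumb: the normal approximation is 'reasonable' when
# n*theta > 5 and n*(1 - theta) > 5 both hold.
def passes_rule(n: int, theta: float) -> bool:
    return n * theta > 5 and n * (1 - theta) > 5

for theta in (0.5, 0.01, 0.99):
    print(f"n=100, theta={theta}: "
          f"n*theta={100 * theta:g}, n*(1-theta)={100 * (1 - theta):g}, "
          f"passes={passes_rule(100, theta)}")
# theta=0.5 : 50 and 50 -> passes
# theta=0.01:  1 and 99 -> fails
# theta=0.99: 99 and  1 -> fails
```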
