Central Limit Theorem – Origin and Significance of $\sqrt{n}$

Tags: central-limit-theorem, intuition

A very simple version of the central limit theorem is the following:
$$
\sqrt{n}\bigg(\bigg(\frac{1}{n}\sum_{i=1}^n X_i\bigg) - \mu\bigg)\ \xrightarrow{d}\ \mathcal{N}(0,\;\sigma^2)
$$
This is the Lindeberg–Lévy CLT. I do not understand why there is a $\sqrt{n}$ on the left-hand side. The Lyapunov CLT says
$$
\frac{1}{s_n} \sum_{i=1}^{n} (X_i - \mu_i) \ \xrightarrow{d}\ \mathcal{N}(0,\;1)
$$
but why not $\sqrt{s_n}$? Could anyone tell me what these factors, such as $\sqrt{n}$ and $\frac{1}{s_n}$, are, and how we get them in the theorems?

Best Answer

Nice question (+1)!!

You will remember that for independent random variables $X$ and $Y$, $Var(X+Y) = Var(X) + Var(Y)$ and $Var(a\cdot X) = a^2 \cdot Var(X)$. So the variance of $\sum_{i=1}^n X_i$ is $\sum_{i=1}^n \sigma^2 = n\sigma^2$, and the variance of $\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$ is $n\sigma^2 / n^2 = \sigma^2/n$.
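
If it helps to see this numerically, here is a minimal NumPy sketch of those two variance facts; the Exp(1) distribution (variance 1), the sample size $n = 50$, and the number of replications are arbitrary choices for illustration, not part of the argument.

```python
# Minimal check that Var(sum) ≈ n·σ² and Var(mean) ≈ σ²/n.
# Exp(1) has variance σ² = 1; n and reps are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n, reps, sigma2 = 50, 200_000, 1.0

X = rng.exponential(scale=1.0, size=(reps, n))
print(X.sum(axis=1).var(), n * sigma2)       # variance of the sum:  ≈ 50
print(X.mean(axis=1).var(), sigma2 / n)      # variance of the mean: ≈ 0.02
```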

This is for the variance. To standardize a random variable, you subtract its expected value and divide by its standard deviation. As you know, the expected value of $\bar{X}$ is $\mu$, so the variable

$$ \frac{\bar{X} - E\left( \bar{X} \right)}{\sqrt{ Var(\bar{X}) }} = \sqrt{n}\, \frac{\bar{X} - \mu}{\sigma}$$ has expected value 0 and variance 1. So if it tends to a Gaussian, it has to be the standard Gaussian $\mathcal{N}(0,\;1)$. Your formulation in the first equation is equivalent: multiplying this standardized quantity by $\sigma$ gives $\sqrt{n}(\bar{X} - \mu)$, whose limiting distribution has variance $\sigma^2$.
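
Here is a quick simulation sketch of that standardization; the Exp(1) distribution (so $\mu = \sigma = 1$) is again just an illustrative choice. The statistic $\sqrt{n}(\bar{X}-\mu)/\sigma$ has mean about 0, variance about 1, and its tail probabilities approach those of $\mathcal{N}(0,1)$ as $n$ grows.

```python
# Sketch: sqrt(n)·(Xbar - mu)/sigma has mean ≈ 0, variance ≈ 1, and its
# distribution approaches N(0, 1) as n grows.  Exp(1) (mu = sigma = 1) is
# an arbitrary illustrative choice.
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, reps = 1.0, 1.0, 50_000

for n in (5, 50, 500):
    xbar = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
    z = np.sqrt(n) * (xbar - mu) / sigma
    # the last column should approach Phi(-1.96) ≈ 0.025
    print(n, z.mean().round(3), z.var().round(3), (z < -1.96).mean().round(4))
```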

Regarding your second point, I believe the equation above shows that you have to divide by a standard deviation, not by its square root, to standardize. In the Lyapunov CLT, $s_n$ is already a standard deviation ($s_n^2 = \sum_{i=1}^n \sigma_i^2$ is the variance of $\sum_{i=1}^n (X_i - \mu_i)$), which explains why you divide by $s_n$ and not $\sqrt{s_n}$.
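
To see the Lyapunov scaling at work, here is a small sketch with independent but non-identically distributed variables (uniforms with growing ranges, a purely illustrative choice): dividing the centered sum by $s_n$ gives variance 1, while dividing by $\sqrt{s_n}$ does not.

```python
# Sketch of the Lyapunov setting: independent but NOT identically distributed
# X_i ~ Uniform(0, i) (an arbitrary illustrative choice), with mu_i = i/2 and
# sigma_i^2 = i^2/12.  Dividing the centered sum by s_n = sqrt(sum of variances)
# gives variance ≈ 1; dividing by sqrt(s_n) instead clearly does not.
import numpy as np

rng = np.random.default_rng(2)
n, reps = 100, 200_000
b = np.arange(1, n + 1)                  # upper bounds of the uniforms
mu_i = b / 2
s_n = np.sqrt(np.sum(b**2 / 12))         # standard deviation of the sum

X = rng.uniform(low=0.0, high=b, size=(reps, n))      # broadcasts over columns
centered_sum = (X - mu_i).sum(axis=1)
print((centered_sum / s_n).var().round(3))            # ≈ 1
print((centered_sum / np.sqrt(s_n)).var().round(1))   # ≈ s_n, nowhere near 1
```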

Addition: @whuber suggests discussing why the scaling is $\sqrt{n}$. He does so there, but because his answer is very long I will try to capture the essence of his argument (which is a reconstruction of de Moivre's reasoning).

If you add a large number $n$ of +1's and -1's, you can approximate by elementary counting the probability $P(j)$ that there are $n/2 + j$ plus-ones (so that the sum equals $2j$). The log of this probability is proportional to $-j^2/n$, so for it to converge to a constant as $n$ grows large, $j$ has to be of order $\sqrt{n}$; that is, we need a normalizing factor of order $\sqrt{n}$.

Using modern (post de Moivre) mathematical tools, you can see the approximation mentioned above by noticing that the sought probability is

$$P(j) = \frac{{n \choose n/2+j}}{2^n} = \frac{n!}{2^n(n/2+j)!(n/2-j)!}$$

which we approximate using Stirling's formula, $m! \approx \sqrt{2\pi m}\,(m/e)^m$, dropping the square-root factors:

$$ P(j) \approx \frac{n^n e^{n/2+j} e^{n/2-j}}{2^n e^n (n/2+j)^{n/2+j} (n/2-j)^{n/2-j} } = \left(\frac{1}{1+2j/n}\right)^{n/2+j} \left(\frac{1}{1-2j/n}\right)^{n/2-j}. $$

$$ \log(P(j)) \approx -(n/2+j) \log(1+2j/n) - (n/2-j) \log(1-2j/n) \\ \sim -2j(n/2+j)/n + 2j(n/2-j)/n \propto -j^2/n.$$
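
For what it's worth, here is a quick numerical check of that proportionality using the exact binomial probabilities. Comparing $\log P(j)$ with $\log P(0)$ cancels the square-root prefactor that Stirling's formula would otherwise contribute, and keeping the quadratic term of $\log(1 \pm 2j/n)$ shows the constant of proportionality is $2$, i.e. $\log[P(j)/P(0)] \approx -2j^2/n$.

```python
# Quick check that log P(j) depends on j like a constant times -j^2/n.
# Comparing against P(0) cancels the sqrt prefactor of Stirling's formula;
# the exact ratio log[P(j)/P(0)] should approach -2*j^2/n.
from math import comb, log

for n in (100, 1_000, 10_000):                   # even n, so n/2 is an integer
    for j in (round(0.5 * n**0.5), round(n**0.5), round(2 * n**0.5)):
        lhs = log(comb(n, n // 2 + j)) - log(comb(n, n // 2))
        print(n, j, round(lhs, 3), round(-2 * j**2 / n, 3))
```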
