Central Limit Theorem – Detailed Understanding


What is wrong with the following sentence:

The Central Limit Theorem implies that, as the sample size grows, the error distribution approaches normality.

Am I correct in saying that it should instead state that the MEAN of the sample error approaches zero as the sample size grows?

Best Answer

In its standard simplest form, the Central Limit Theorem (CLT) is a statement about the cumulative distribution function of the random variable $$Z_n = \frac{X_1 + X_2 + \cdots + X_n -n\mu}{\sigma \sqrt{n}}$$ where the $X_i$ are independent identically distributed random variables with mean $\mu$ and standard deviation $\sigma$. The CLT asserts that for each $a$, $-\infty < a < \infty$, $$F_{Z_n}(a) = P\left\{\frac{X_1 + X_2 + \cdots + X_n -n\mu}{\sigma \sqrt{n}} \leq a \right\} \to \Phi(a) = \int_{-\infty}^a \frac{e^{-x^2/2}}{\sqrt{2\pi}}\mathrm dx$$ as $n \to \infty$.
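The convergence $F_{Z_n}(a) \to \Phi(a)$ can be checked numerically. The following is only a rough sketch, assuming the $X_i$ are Exponential(1) (so $\mu = \sigma = 1$); the helper names `simulate_z` and `empirical_cdf`, the seed, and the trial counts are illustrative choices, not part of the theorem.

```python
import math
import random
from statistics import NormalDist

def simulate_z(n, trials, rng):
    """Draw `trials` realizations of Z_n, assuming X_i ~ Exponential(1) (mu = sigma = 1)."""
    mu, sigma = 1.0, 1.0
    zs = []
    for _ in range(trials):
        total = sum(rng.expovariate(1.0) for _ in range(n))
        zs.append((total - n * mu) / (sigma * math.sqrt(n)))
    return zs

def empirical_cdf(samples, a):
    """Fraction of samples that are <= a, an estimate of F_{Z_n}(a)."""
    return sum(1 for s in samples if s <= a) / len(samples)

rng = random.Random(0)      # fixed seed so the run is repeatable
phi = NormalDist().cdf      # standard normal CDF, Phi
max_gap = {}
for n in (1, 10, 200):
    zs = simulate_z(n, 4000, rng)
    # Largest discrepancy between the empirical CDF and Phi at a few test points
    max_gap[n] = max(abs(empirical_cdf(zs, a) - phi(a)) for a in (-1.0, 0.0, 1.0))
    print(n, round(max_gap[n], 3))
```

Under these assumptions the gap should be large for $n = 1$ (the exponential distribution is far from normal) and shrink toward the level of simulation noise as $n$ grows.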

If by "error distribution" you mean the distribution function of $$Y_n = \left(\frac{1}{n}\sum_{i=1}^n X_i\right) -\mu = \frac{\sigma}{\sqrt{n}}Z_n,$$ that is, the difference of the sample mean $\bar{X} = n^{-1}\sum_iX_i$ and the population mean $\mu$, then the CLT certainly does not imply that $F_{Y_n}(\cdot)$ "approaches normality" as the sample size $n$ grows large in the usual sense of normality, though nitpickers may want to claim that the distribution is approaching a normal distribution with mean $0$ and standard deviation $0$ (often called a constant by statistically illiterate people).
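The collapse of $Y_n$ toward the constant $0$ can also be seen in simulation. This is a minimal sketch under the same illustrative assumption of Exponential(1) data ($\mu = \sigma = 1$); `simulate_y` is a hypothetical helper, and the sample sizes are arbitrary.

```python
import math
import random
import statistics

def simulate_y(n, trials, rng, mu=1.0):
    """Draw `trials` realizations of Y_n = (sample mean) - mu for Exponential(1) data."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n - mu
            for _ in range(trials)]

rng = random.Random(1)      # fixed seed so the run is repeatable
sigma = 1.0                 # standard deviation of Exponential(1)
spread = {}
for n in (10, 100, 1000):
    ys = simulate_y(n, 2000, rng)
    spread[n] = statistics.pstdev(ys)
    # Compare the observed spread of Y_n with the theoretical sigma / sqrt(n)
    print(n, round(spread[n], 4), round(sigma / math.sqrt(n), 4))
```

The observed standard deviation of $Y_n$ should track $\sigma/\sqrt{n}$ and so shrink toward $0$, whereas the rescaled variable $Z_n$ keeps standard deviation $1$ for every $n$.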

On the other hand, the mean of the sample error is not a random variable but a constant (in fact, $0$ since the sample mean is an unbiased estimator of the population mean) and does not need to approach $0$; it is already there! I think what you meant to say is that the distribution $F_{Y_n}(a)$ of the sample error approaches the unit step function: for each $a \neq 0$, $$F_{Y_n}(a) \to u(a) = \begin{cases}1, & \text{if}~a > 0,\\ 0, & \text{if}~a < 0,\end{cases}$$ which is certainly correct, and follows from the CLT, but also follows from results such as the weak law of large numbers, which makes no assertions about the distribution of $Z_n$, only about $Y_n$.
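For completeness, here is a short sketch of why the step-function limit follows from the CLT, using only the relation $Y_n = \sigma Z_n/\sqrt{n}$ from above (with $\sigma > 0$). For any real $a$, $$F_{Y_n}(a) = P\left\{\frac{\sigma}{\sqrt{n}}Z_n \leq a\right\} = F_{Z_n}\left(\frac{a\sqrt{n}}{\sigma}\right).$$ If $a > 0$, fix any real $b$. For all sufficiently large $n$ we have $a\sqrt{n}/\sigma \geq b$, and since distribution functions are nondecreasing, $$F_{Y_n}(a) = F_{Z_n}\left(\frac{a\sqrt{n}}{\sigma}\right) \geq F_{Z_n}(b) \to \Phi(b).$$ Thus $\liminf_n F_{Y_n}(a) \geq \Phi(b)$ for every $b$, and letting $b \to \infty$ gives $F_{Y_n}(a) \to 1$. If $a < 0$, the same argument with the inequalities reversed gives $\limsup_n F_{Y_n}(a) \leq \Phi(b)$ for every $b$, and letting $b \to -\infty$ gives $F_{Y_n}(a) \to 0$.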
