[Math] Central Limit Theorem Definition

Tags: normal distribution, probability, probability distributions, probability theory, statistics

My friend and I have a bet going about the definition of the Central Limit Theorem.

Suppose we define an example as a number drawn at random from some probability density function with a defined, finite mean and variance, and we define a sample as a set of N examples (with N > 1).

Then, we take S samples and create a sampling distribution D over the means of each individual sample.

I am arguing that the Central Limit Theorem states that as the number of samples S approaches infinity, the sampling distribution D will approximate a normal distribution.

My friend is arguing that the Central Limit Theorem states that for any given number of samples S, the sampling distribution D will not necessarily approximate a normal distribution, but that as the number of examples per sample N approaches infinity, D will approximate a normal distribution.

Who is right?

Update: I lost this bet.

Best Answer

I'd say that your friend is more correct, in that he/she correctly points to $N$ (the sample size, i.e., the number of values that are summed to compute the average, the "sample mean") as the quantity that must tend to $\infty$ for the CLT to hold.

We have

$$S_N=\frac{X_1+X_2+ \cdots +X_N}{N}$$

Here, in our setting, the set $\{X_1, \ldots, X_N\}$ is one sample, of size $N$, and $S_N$ is the sample mean (i.e., the average) of that sample.

This $S_N$ is a random variable (informally, it takes a different random value for each sample). What the CLT says concerns the distribution of this $S_N$ as $N\to \infty$. Of course, if you were practically interested in checking that $S_N$ (for some fixed $N$) is indeed approximately Gaussian, you might want to draw many values of $S_N$ and, e.g., plot a histogram; for this, you would need to draw a lot of samples (each of size $N$). But this has nothing to do with the asymptotics of the theorem.
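To make this concrete, here is a minimal simulation sketch (assuming an exponential parent distribution, chosen because it is strongly right-skewed, and using skewness as a rough proxy for non-normality). Increasing $N$ drives the skewness of the sampling distribution $D$ toward zero, whereas increasing $S$ would only give a sharper picture of the same skewed shape:

```python
import random
import statistics

def sample_means(n, s, seed=0):
    """Draw s samples, each of size n, from an exponential distribution
    (rate 1, so mean 1 and variance 1), and return the s sample means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(s)]

def skewness(xs):
    """Sample skewness: third standardized moment of the values xs."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return statistics.fmean(((x - m) / sd) ** 3 for x in xs)

# S is held fixed; only N (the size of each sample) grows.
# For the exponential, the skewness of S_N is 2/sqrt(N), so it
# shrinks toward 0 (the skewness of a normal) as N increases.
for n in (2, 20, 200):
    d = sample_means(n, s=20000)
    print(f"N = {n:4d}: skewness of D is roughly {skewness(d):+.2f}")
```

With $N = 2$ the histogram of $D$ is still clearly skewed no matter how large $S$ is; by $N = 200$ it is close to symmetric, illustrating that it is $N\to\infty$, not $S\to\infty$, that does the work in the CLT.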
