Creating an Example to Disprove the Central Limit Theorem

central limit theorem, probability

In class, we are always told that for the Central Limit Theorem to be applicable, observations have to be IID (Independent and Identically Distributed). However, we are not always told why this IID condition is so important for the Central Limit Theorem.

This being said, I am trying to create an example where the IID condition is not met and thus show myself why it is required.

Part 1: The first thing that comes to mind is an Autoregressive Process as by definition AR Processes are said not to be IID. For instance, suppose we have an AR(1) Process:

$$y_t = \phi y_{t-1} + \epsilon_t$$

Based on this AR(1) process, I know the following:

  • $E(\epsilon_t) = 0$
  • $E(\epsilon_t^2) = \sigma^2$
  • $E(\epsilon_t\epsilon_s) = 0$ for $t \neq s$
  • $Var(y_t) = \frac{\sigma^2}{1-\phi^2}$
  • $E(y_t) = \phi E(y_{t-1}) = 0$
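The mean and variance listed above can be sanity-checked by simulation. The following is a minimal sketch, not part of the original derivation: it assumes Gaussian white noise, $|\phi| < 1$ (needed for the stationary variance formula to hold), and the illustrative values $\phi = 0.6$, $\sigma = 1$. The helper name `simulate_ar1` is made up for this example.

```python
import random
import statistics


def simulate_ar1(phi, sigma, n, burn_in=1000, seed=0):
    """Simulate y_t = phi * y_{t-1} + eps_t with Gaussian white noise eps_t.

    The first `burn_in` values are discarded so the returned path is
    (approximately) drawn from the stationary distribution.
    """
    rng = random.Random(seed)
    y = 0.0
    path = []
    for t in range(burn_in + n):
        y = phi * y + rng.gauss(0.0, sigma)
        if t >= burn_in:
            path.append(y)
    return path


phi, sigma = 0.6, 1.0  # illustrative values, assumed for this sketch
path = simulate_ar1(phi, sigma, n=200_000)

sample_mean = statistics.fmean(path)
sample_var = statistics.pvariance(path)
theory_var = sigma**2 / (1 - phi**2)  # Var(y_t) from the list above

print(f"sample mean:     {sample_mean:.4f} (theory: 0)")
print(f"sample variance: {sample_var:.4f} (theory: {theory_var:.4f})")
```

With a long enough path, the sample mean should sit near 0 and the sample variance near $\sigma^2 / (1 - \phi^2)$.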

Part 2: As for the Central Limit Theorem, I know that when $n$ is large, the standardized sample mean of IID observations with finite variance behaves as:

$$\frac{\bar{x} - E(X)}{\sqrt{\frac{Var(X)}{n}}} \approx N(0,1)$$
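To see this approximation at work in the IID case, here is a small sketch using only Python's standard library. The Exponential(1) population (mean 1, variance 1), the sample size, and the number of replications are all assumptions chosen for illustration.

```python
import math
import random
import statistics

rng = random.Random(1)
n, reps = 200, 5000            # assumed sample size and number of replications
pop_mean, pop_var = 1.0, 1.0   # Exponential(1) has mean 1 and variance 1

z_scores = []
for _rep in range(reps):
    # Draw n IID observations and standardize their sample mean.
    sample = [rng.expovariate(1.0) for _i in range(n)]
    x_bar = statistics.fmean(sample)
    z_scores.append((x_bar - pop_mean) / math.sqrt(pop_var / n))

z_mean = statistics.fmean(z_scores)
z_var = statistics.pvariance(z_scores)
# Share of standardized means inside the central 95% band of N(0, 1).
share_inside = sum(abs(z) < 1.96 for z in z_scores) / reps

print(f"mean of z-scores:     {z_mean:.3f}")
print(f"variance of z-scores: {z_var:.3f}")
print(f"share with |z| < 1.96: {share_inside:.3f}")
```

If the approximation holds, the z-scores should have mean close to 0, variance close to 1, and roughly 95% of them should fall inside $\pm 1.96$, even though the underlying population is strongly skewed.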

Part 3: Putting this all together, I would now like to show that the above AR(1) Process DOES NOT converge to a Standard Normal Distribution:

$$\frac{Y_t - E(Y_t)}{\sqrt{\frac{Var(Y_t)}{n}}} = \frac{Y_t - E(Y_t)}{\sqrt{\frac{\sigma^2}{1-\phi^2}\frac{1}{n}}} \not\approx N(0,1) $$
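One way to probe this claim numerically is sketched below. It is a simulation, not a proof, and it rests on several assumptions: Gaussian noise, $\phi = 0.6$, $\sigma = 1$, and a made-up helper name `naive_z_scores`. Since the Central Limit Theorem concerns the sample mean, the sketch standardizes $\bar{Y}$ (rather than a single $Y_t$) using the IID-style standard error $\sqrt{Var(Y_t)/n}$.

```python
import math
import random
import statistics


def naive_z_scores(phi, sigma, n, reps, seed=0):
    """Standardize AR(1) sample means as if the observations were IID."""
    rng = random.Random(seed)
    marginal_sd = sigma / math.sqrt(1 - phi**2)  # sqrt of Var(y_t)
    naive_se = marginal_sd / math.sqrt(n)        # IID-style standard error
    zs = []
    for _rep in range(reps):
        # Start each path in the stationary distribution.
        y = rng.gauss(0.0, marginal_sd)
        total = 0.0
        for _t in range(n):
            y = phi * y + rng.gauss(0.0, sigma)
            total += y
        zs.append((total / n) / naive_se)  # E(y_t) = 0, so no centering term
    return zs


phi = 0.6  # illustrative value, assumed for this sketch
zs = naive_z_scores(phi, sigma=1.0, n=500, reps=2000)
z_var = statistics.pvariance(zs)

print(f"variance of naive z-scores: {z_var:.2f}")
print(f"(1 + phi) / (1 - phi):      {(1 + phi) / (1 - phi):.2f}")
```

In such a simulation the naively standardized mean tends to have variance near $(1+\phi)/(1-\phi)$ rather than 1, so the IID scaling is off for $\phi \neq 0$, even though a histogram of the z-scores can still look bell-shaped.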

However, I am not sure if I am doing this correctly for the AR(1) Process and have in fact shown that in the absence of IID, the Central Limit Theorem is not necessarily valid.

In general, can someone please show me an example where the IID condition is not met and as a result the Central Limit Theorem does not apply?

Thanks!

Note: I am aware that there are versions of the Central Limit Theorem that do not require the IID condition (e.g. https://en.wikipedia.org/wiki/Central_limit_theorem#Lyapunov_CLT, https://en.wikipedia.org/wiki/Lindeberg%27s_condition). However, I am specifically interested in constructing an example that shows why the classic Central Limit Theorem requires the IID condition.

Best Answer

Let's say Prof. Scatterbrain is trying to demonstrate the truth of the central limit theorem to his large class of 100 students.

This theorem states roughly that "If you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population, then the distribution of the sample means will be approximately normally distributed."

He gets his hands on the results of the 2023 Boston Marathon, sorted by bib number from 1 to 20,000. (Note: I downloaded the actual results for authenticity.)

He hands his secretary these results and asks her to type them up. He wants her to take the first 200 finishing times and put them on a single page. Type the next 200 on another page and so on until he has 200 independent finishing times for each student.

The secretary gets bored after 2 pages and just photocopies them 50 times so that she has 100 sheets of paper, which she gives to the professor the next morning. The professor hands them out to the class and asks each student to calculate the mean of the results they have and to bring their answer in the next day.

The next day the professor compiles a histogram of the means supplied by the students, which he expects will look like a normal distribution. He is expecting something like this (this is actually from the real results and looks roughly normally distributed):

[Figure: What Professor Scatterbrain expects]

But this is what he actually gets (again, real, but only two different sets; it does not look normally distributed):

[Figure: What Professor Scatterbrain actually gets]

The Professor quickly ends the class amid raucous laughter.

And there you have an example of when the data is not IID (specifically not independent) and therefore the Central Limit Theorem does not apply.
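The same effect can be reproduced in a few lines. This is only a sketch: the finishing times below are synthetic draws (a normal distribution with mean 230 minutes and standard deviation 40 is an arbitrary assumption), not the actual Boston Marathon results, and the variable names are made up for illustration.

```python
import random
import statistics

rng = random.Random(42)

# Synthetic stand-in for 20,000 finishing times in minutes (assumed, not real data).
times = [rng.gauss(230.0, 40.0) for _ in range(20_000)]

# What the professor asked for: 100 different pages of 200 times each.
pages = [times[i:i + 200] for i in range(0, len(times), 200)]
expected_means = [statistics.fmean(page) for page in pages]

# What the secretary delivered: the first 2 pages, photocopied 50 times.
photocopied = pages[:2] * 50
actual_means = [statistics.fmean(page) for page in photocopied]

print("sheets handed out:          ", len(photocopied))
print("distinct means (as planned):", len(set(expected_means)))
print("distinct means (photocopies):", len(set(actual_means)))
```

With the photocopied sheets there are only two distinct sample means, each repeated 50 times, so the histogram collapses to two bars no matter how many students take part.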
