What’s wrong with the simulation of the Central Limit Theorem

While there is code in this question, I suspect the answer will be mathematical.

I am trying to create a numerical simulation of the Central Limit Theorem (CLT). My understanding is that if $S_n = \frac{X_1 + X_2 + \dots + X_n}{n}$ for $n$ i.i.d. random variables $\{X_1, X_2, \dots X_n\}$, then the CLT states that

$$
\lim_{n \rightarrow \infty} \sqrt{n}(S_n – \mu) \stackrel{\text{dist}}{\rightarrow} \mathcal{N}(0, \sigma^2)
$$

where $X_i$ has expectation $\mu$ and variance $\sigma^2$. So one way to empirically explore this is to plot both the distribution $\mathcal{N}(0, \sigma^2)$ and the distribution from $n$ random samples of the random variable $\sqrt{n}(S_n – \mu)$. When I do that, I get this figure:

My code is below. In my simulation, $X_i \sim \text{Uniform}(0, 1)$, meaning that $\mu = \frac{1}{2}(0 – 1)$ and $\sigma^2 = \frac{1}{12}(1 – 0)$. I first generate 2000 replicates of $\sqrt{n} (S_n – \mu)$ and plot that distribution. I then plot the theoretical distribution that I expect this distribution to converge to, namely $\mathcal{N}(0, \sigma^2)$. Unfortunately, the two distributions look way off.

I suspect I have a "mathematical bug" rather than an programming one. In summary: What is wrong with my thinking about the CLT that makes this simulation wrong?

Simulation code

import numpy as np
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt
from   scipy.stats import norm as normal

a = 0
b = 1
n = 100
reps = 2000
mu_lim  = 1/2.  * (a + b)
var_lim = 1/12. * (b - a)**2

rvs = []
for _ in range(reps):
    Sn = np.random.uniform(a, b, n).sum() / n
    rv = np.sqrt(n) * (Sn - mu_lim)
    rvs.append(rv)

_, bins, _ = plt.hist(rvs, bins=50, density=True)
(mu_actual, var_actual) = normal.fit(rvs)
y = normal.pdf(bins, mu_actual, var_actual)
plt.plot(bins, y, 'r--', label='Actual N(., .) for n = %s' % n)

# Plot distribution to converge to
x = np.linspace(-3*var_lim, 3*var_lim, 100)
plt.plot(x, normal.pdf(x, 0, var_lim), label='N(0, s^2)')

plt.legend()
plt.show()

What’s wrong with the simulation of the Central Limit Theorem

Simulation code

Best Answer

Related Question

Simulation code

Best Answer

Related Solutions

[Math] Central Limit Theorem (multivariate)

My simulation of the Central Limit Theorem does not converge to correct value

Related Question