Solved – Normal approximation to the Poisson distribution


Wikipedia says:

For sufficiently large values of $λ$, (say $λ>1000$), the normal distribution with mean $λ$ and variance $λ$ (standard deviation $\sqrt{\lambda}$), is an excellent approximation to the Poisson distribution. If $λ$ is greater than about 10, then the normal distribution is a good approximation if an appropriate continuity correction is performed, i.e., $P(X ≤ x),$ where (lower-case) $x$ is a non-negative integer, is replaced by $P(X ≤ x + 0.5).$

$F_\mathrm{Poisson}(x;\lambda) \approx F_\mathrm{normal}(x;\mu=\lambda,\sigma^2=\lambda)$

Unfortunately this isn't cited. I want to be able to show or prove this with some rigour. How can one actually say that the normal distribution is a good approximation when $\lambda > 1000$? How is this 'excellent' approximation quantified, and what measures were used?

The furthest I've got with this is a post where John talks about using the Berry–Esseen theorem and approximates the error between the two CDFs. From what I can see, he does not try any values of $\lambda \geq 1000$.
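One way to put a number on "good" versus "excellent" is the largest gap between the two CDFs. The following is a minimal Python sketch, not taken from Wikipedia or from John's post: the helper names (`poisson_pmf`, `normal_cdf`, `max_cdf_error`) and the 12-standard-deviation cutoff are assumptions made for illustration, and the gap is evaluated at integer points only.

```python
import math

def poisson_pmf(k, lam):
    # Work in logs so that large lam does not overflow; lgamma(k + 1) = log(k!)
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def normal_cdf(x, mean, var):
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))

def max_cdf_error(lam, correction=0.5, width=12.0):
    """Largest |F_Poisson(k) - F_normal(k + correction)| over integers k
    from 0 up to `width` standard deviations above the mean (assumed cutoff)."""
    hi = int(lam + width * math.sqrt(lam)) + 1
    cdf = 0.0
    worst = 0.0
    for k in range(hi + 1):
        cdf += poisson_pmf(k, lam)  # running Poisson CDF
        worst = max(worst, abs(cdf - normal_cdf(k + correction, lam, lam)))
    return worst

for lam in (10, 100, 1000):
    print(lam, max_cdf_error(lam), max_cdf_error(lam, correction=0.0))
```

Comparing the two printed columns for $\lambda = 10$ and $\lambda = 1000$ gives a concrete sense of what the continuity correction buys and of how the gap shrinks as $\lambda$ grows; a Berry–Esseen type bound suggests a rate on the order of $1/\sqrt{\lambda}$.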

Best Answer

Suppose $X$ is Poisson with parameter $\lambda$, and $Y$ is normal with mean and variance $\lambda$. It seems to me that the appropriate comparison is between $\Pr(X = n)$ and $\Pr(Y \in [n-\frac12,n+\frac12])$. Here for simplicity I write $n = \lambda + \alpha \sqrt\lambda$, that is, $n$ lies $\alpha$ standard deviations from the mean.

So I cheated and used Mathematica. Both $\Pr(X = n)$ and $\Pr(Y \in [n-\frac12,n+\frac12])$ are asymptotic to $$ \frac 1{\sqrt{2\pi \lambda}} e^{-\alpha^2/2} $$ as $\lambda \to \infty$, but their difference $\Pr(X = n) - \Pr(Y \in [n-\frac12,n+\frac12])$ is asymptotic to $$ \frac{\alpha \left(\alpha ^2-3\right) e^{-\alpha ^2/{2}}}{6 \sqrt{2 \pi } \lambda } $$ If you plot this as a function of $\alpha$, you will get the same curve as is shown in the second-to-last figure in http://www.johndcook.com/blog/normal_approx_to_poisson/.

Here are the commands I used:

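  (* n sits alpha standard deviations from the mean lambda *)
  n = lambda + alpha Sqrt[lambda];
  (* Poisson probability at n; n! is Gamma[n+1] for non-integer n *)
  p1 = Exp[-lambda] lambda^n/n!;
  (* normal probability of the interval [n-1/2, n+1/2] *)
  p2 = Integrate[1/Sqrt[2 Pi]/Sqrt[lambda] Exp[-(x-lambda)^2/2/lambda], {x, n-1/2, n+1/2}];
  (* expand both for large lambda *)
  Series[p1, {lambda, Infinity, 1}]
  Series[p2, {lambda, Infinity, 1}]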
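The leading-order difference can also be checked numerically without a computer algebra system. The Python sketch below is an illustration, not the commands I used: the helper names are hypothetical, and $\lambda = 10^4$ is chosen only so that $n = \lambda + \alpha\sqrt\lambda$ is an integer for integer $\alpha$.

```python
import math

def poisson_pmf(n, lam):
    # Work in logs so that large lam does not overflow; lgamma(n + 1) = log(n!)
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def normal_cdf(x, mean, var):
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))

def actual_difference(alpha, lam):
    """Pr(X = n) - Pr(Y in [n - 1/2, n + 1/2]) with n = lam + alpha * sqrt(lam)."""
    n = lam + alpha * math.sqrt(lam)
    p1 = poisson_pmf(n, lam)
    p2 = normal_cdf(n + 0.5, lam, lam) - normal_cdf(n - 0.5, lam, lam)
    return p1 - p2

def predicted_difference(alpha, lam):
    """Leading-order asymptotic difference quoted in the text."""
    return (alpha * (alpha**2 - 3) * math.exp(-alpha**2 / 2)
            / (6 * math.sqrt(2 * math.pi) * lam))

lam = 10_000.0
for alpha in (-2, -1, 1, 2):
    print(alpha, actual_difference(alpha, lam), predicted_difference(alpha, lam))
```

The two printed columns should agree up to a relative correction of order $1/\sqrt\lambda$. Values of $\alpha$ near $0$ or $\pm\sqrt3$ are left out because the leading term vanishes there and the next-order term dominates.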

Also, with a bit of experimentation, it seems to me that a better asymptotic approximation to $\Pr(X = n)$ is $\Pr(Y \in [n-\alpha^2/6,n+1-\alpha^2/6])$. Then the error is $$ -\frac{\left(5 \alpha ^4-9 \alpha ^2-6\right) e^{-{\alpha ^2}/{2}} }{72 \sqrt{2 \pi } \lambda ^{3/2} } $$ which is about $\sqrt\lambda$ times smaller.
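A quick numerical comparison of the two intervals, as a hedged Python sketch: the function name `errors` and the choice $\lambda = 10^4$ are assumptions for illustration, and the check covers only a few values of $\alpha$.

```python
import math

def poisson_pmf(n, lam):
    # Work in logs so that large lam does not overflow; lgamma(n + 1) = log(n!)
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def normal_cdf(x, mean, var):
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))

def errors(alpha, lam):
    """Return (error of the centred interval, error of the shifted interval),
    each measured as Pr(X = n) minus the normal probability of the interval."""
    n = lam + alpha * math.sqrt(lam)
    p1 = poisson_pmf(n, lam)
    # Centred interval [n - 1/2, n + 1/2]
    centred = normal_cdf(n + 0.5, lam, lam) - normal_cdf(n - 0.5, lam, lam)
    # Shifted interval [n - alpha^2/6, n + 1 - alpha^2/6]
    shift = alpha**2 / 6.0
    shifted = normal_cdf(n + 1.0 - shift, lam, lam) - normal_cdf(n - shift, lam, lam)
    return p1 - centred, p1 - shifted

lam = 10_000.0
for alpha in (-2, -1, 1, 2):
    plain, improved = errors(alpha, lam)
    print(alpha, plain, improved, abs(plain / improved))
```

The last printed column is the factor by which the shifted interval reduces the error at that $\alpha$; for fixed $\alpha$ it should grow roughly in proportion to $\sqrt\lambda$ as `lam` is increased.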