Solved – How large does a Poisson distribution’s mean need to be to use normal distribution statistics

Tags: distributions, mean, normal distribution, poisson distribution, variance-stabilizing

As the mean of a Poisson distribution increases, the Poisson distribution approaches a normal distribution. I assume that once the Poisson mean becomes large enough, we can use normal distribution statistics. We could then start saying things like '68% of the distribution will lie within 1 standard deviation of the Poisson mean'.

How large does the mean of a Poisson distribution need to be before we can start using normal distribution statistics?

Best Answer

Update: As @whuber has pointed out in his comments, a better way to look at this is to compute the true coverage probabilities for the Poisson. The simulation has its uses, but it does not reveal the interesting pattern seen in the plot below.

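[Figure: true coverage probabilities for the Poisson as a function of $\lambda$, with the Normal value 0.6827 marked by a red horizontal line]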

This was based on @whuber's code (see his first comment on this answer):

f <- function(mu) ppois(mu + sqrt(mu), mu) - ppois(mu - sqrt(mu), mu) ## computing coverage probabilities as a function of the true mean
curve(f, from = 1, to = 2000, n = 300, main = "True Coverage Probabilities for Poisson", xlab = expression(lambda), ylab = "Probability")
abline(h = 0.6827, col = "red") ## coverage prob. for a Normal RV

What this does: If $X \sim \textrm{Pois}(\lambda)$ then $E(X) = \lambda$ and $V(X) = \lambda$. This means that the interval in question is $I := (\lambda - \sqrt \lambda, \lambda + \sqrt \lambda)$. The function f computes $$ \mathbb P_\lambda(X \in I) = F_X(\lambda + \sqrt \lambda; \lambda) - F_X(\lambda - \sqrt \lambda; \lambda) $$ where in R $F_X(t; \lambda)$ is obtained via the ppois function.
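To make the coverage calculation concrete, here is a minimal sketch that tabulates the same quantity at a handful of means. The names `coverage`, `normal_coverage`, and `lambdas` are illustrative choices, not part of the original answer. One detail worth noting: because `ppois(q, lambda)` returns $P(X \le q)$, the difference of the two CDF values is the probability of the half-open interval $(\lambda - \sqrt\lambda, \lambda + \sqrt\lambda]$, which differs from the open interval $I$ only when an endpoint is an integer.

```r
# Exact coverage of lambda +/- sqrt(lambda) under Poisson(lambda).
# ppois(q, lambda) is P(X <= q), so this is
# P(lambda - sqrt(lambda) < X <= lambda + sqrt(lambda)).
coverage <- function(lambda) {
  ppois(lambda + sqrt(lambda), lambda) - ppois(lambda - sqrt(lambda), lambda)
}

normal_coverage <- pnorm(1) - pnorm(-1)  # 1-SD coverage for a Normal RV

lambdas <- c(5, 10, 50, 100, 500, 1000, 2000)  # illustrative values only
result <- data.frame(
  lambda    = lambdas,
  coverage  = coverage(lambdas),
  deviation = coverage(lambdas) - normal_coverage
)
print(result)
```

The `deviation` column shows how far the exact Poisson coverage sits from the Normal value at each mean, which is one way to decide what "large enough" means for a given tolerance.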

Original answer: This is in no way a categorical answer but I thought you might like to see a simulation. Note that I'm using samples of size $n = 20000$ because you didn't mention that you cared about the sample size, so I wanted each sample to reflect asymptotic properties.

[Figure: simulated 1 SD coverage of Poisson and Normal samples as a function of $\lambda$]

The simulation shows that the Poisson random variables (RVs) do not behave indistinguishably from the Normal RVs until around $\lambda \approx 1000$ with respect to the coverage rate and this choice of $n$. We can also see the variation in the coverage of random samples of Normal RVs even though they all exactly have the property that we are investigating at the population level. Note that in this simulation I compared a random sample to its sample mean and sample standard deviation rather than the population mean and population standard deviation. I chose to do so because I felt this to be more interesting for a discussion about the distribution of a statistic calculated from a sample.

Here's the code to make the plot.

set.seed(1)
lambda.seq <- round(seq(10, 2000, length = 300)) ## lambdas to try
res.norm <- res.pois <- numeric(length(lambda.seq)) ## these get the results
nsim <- 20000 ## number of observations at each iteration

for (i in seq_along(lambda.seq))
{
  sims <- rpois(nsim, lambda.seq[i]) ## simulating Poissons
  res.pois[i] <- mean(sims > mean(sims) - sd(sims) & sims < mean(sims) + sd(sims)) ## proportion of Poisson RVs within 1 sample SD of the sample mean

  sims <- rnorm(nsim, lambda.seq[i], sqrt(lambda.seq[i])) ## simulating normals with mean and variance lambda, matching the Poisson
  res.norm[i] <- mean(sims > mean(sims) - sd(sims) & sims < mean(sims) + sd(sims)) ## same proportion for the Normal RVs
}

plot(res.pois ~ lambda.seq, pch = 19, col = "red", main = "1 SD Coverage Convergence of Poisson to Normal", xlab = expression(lambda), ylab = expression(paste("Proportion  within  ", mu, "  \u00b1  ", sigma, sep = "")))
points(res.norm ~ lambda.seq, pch = 19, col = "blue")
abline(h = .6827) ## this is the true coverage for a normal distribution
legend("topright", c("Poisson", "Normal"), pch = 19, col = c("red", "blue"), bty = "n")
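The simulation above measures coverage around the sample mean and sample standard deviation. As a hedged variant, the sketch below uses the population values $\lambda$ and $\sqrt\lambda$ instead and compares the simulated proportion with the exact probability of the same open interval. The names `sim_coverage` and `exact_coverage` and the three values of $\lambda$ are assumptions made for illustration.

```r
set.seed(1)
lambdas <- c(10, 100, 1000)  # illustrative values only
nsim <- 20000

# Proportion of simulated Poisson draws strictly inside lambda +/- sqrt(lambda)
sim_coverage <- sapply(lambdas, function(lam) {
  sims <- rpois(nsim, lam)
  mean(sims > lam - sqrt(lam) & sims < lam + sqrt(lam))
})

# Exact probability of the same open interval, from the Poisson CDF:
# largest integer below the upper end, largest integer at or below the lower end
exact_coverage <- ppois(ceiling(lambdas + sqrt(lambdas)) - 1, lambdas) -
  ppois(floor(lambdas - sqrt(lambdas)), lambdas)

print(data.frame(lambda = lambdas, simulated = sim_coverage, exact = exact_coverage))
```

Any gap between the two columns is simulation noise, which is why the exact calculation reveals structure that the simulated points blur.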

Related Solutions

Solved – How does the sampling distribution of sample means approximate the population mean

I think you might be confusing the expected sampling distribution of a mean (which we would calculate based on a single sample) with the (usually hypothetical) process of simulating what would happen if we did repeatedly sample from the same population multiple times.

For any given sample size (even n = 2) we would say that the sample mean (from the two people) estimates the population mean. But the estimation accuracy -- that is, how good a job we've done of estimating the population mean based on our sample data, as reflected in the standard error of the mean -- will be poorer than if we had 20 or 200 people in our sample. This is relatively intuitive (larger samples give better estimation accuracy).

We would then use the standard error to calculate a confidence interval, which (in this case) is based around the Normal distribution. In small samples we'd probably use the t-distribution instead, since the standard deviation of the population is often underestimated in a small sample, leading to overly optimistic standard errors.
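The effect of sample size on the standard error and on the t-based interval can be sketched as follows. The population here (Normal with mean 50 and SD 10) and the helper `ci_for_n` are hypothetical, chosen only to illustrate the calculation.

```r
set.seed(42)
population <- rnorm(1e5, mean = 50, sd = 10)  # hypothetical population

# Draw one sample of size n and build a 95% t-based confidence interval
ci_for_n <- function(n) {
  x      <- sample(population, n)
  se     <- sd(x) / sqrt(n)          # estimated standard error of the mean
  t_mult <- qt(0.975, df = n - 1)    # t multiplier, wider for small n
  data.frame(n = n, mean = mean(x), se = se, t_mult = t_mult,
             lower = mean(x) - t_mult * se, upper = mean(x) + t_mult * se)
}

result <- do.call(rbind, lapply(c(2, 20, 200), ci_for_n))
print(result)
```

The `t_mult` column shrinks toward the Normal value `qnorm(0.975)` as n grows, which is the small-sample correction described above.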

In answer to your last question: no, we don't always need a Normally distributed population to apply these estimation methods. The central limit theorem indicates that the sampling distribution of a mean (estimated, again, from a single sample) will tend to follow a normal distribution even when the underlying population has a non-Normal distribution. This is usually appropriate for "bigger" sample sizes.
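A small simulation can illustrate the central limit theorem at work. This sketch assumes an exponential population with rate 1 (a strongly right-skewed choice made purely for illustration) and tracks the skewness of the sample mean as the sample size grows; `skewness`, `sizes`, and `means_by_n` are names introduced here.

```r
set.seed(123)
n_rep <- 5000            # number of repeated samples per sample size
sizes <- c(2, 20, 200)   # illustrative sample sizes

# Simple moment-based skewness; close to 0 for a symmetric distribution
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

# For each n, simulate the sampling distribution of the mean
means_by_n <- lapply(sizes, function(n) replicate(n_rep, mean(rexp(n, rate = 1))))
skews <- sapply(means_by_n, skewness)

print(data.frame(n = sizes, skewness = skews))
```

The skewness of the sample mean falls toward zero as n increases, even though each individual observation comes from a skewed population.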

Having said that, when you have a non-Normal population that you're sampling from, the mean might not be an appropriate summary statistic, even if the sampling distribution for that mean could be considered reliable.

Solved – Normal approximation to the Poisson distribution

Suppose $X$ is Poisson with parameter $\lambda$, and $Y$ is normal with mean and variance $\lambda$. It seems to me that the appropriate comparison is between $\Pr(X = n)$ and $\Pr(Y \in [n-\frac12,n+\frac12])$. Here for simplicity I write $n = \lambda + \alpha \sqrt\lambda$, that is, we are interested when $n$ corresponds to $\alpha$ standard deviations from the mean.

I cheated and used Mathematica. Both $\Pr(X = n)$ and $\Pr(Y \in [n-\frac12,n+\frac12])$ are asymptotic to $$ \frac 1{\sqrt{2\pi \lambda}} e^{-\alpha^2/2} $$ as $\lambda \to \infty$. But their difference, $\Pr(X = n) - \Pr(Y \in [n-\frac12,n+\frac12])$, is asymptotic to $$ \frac{\alpha \left(\alpha ^2-3\right) e^{-\alpha ^2/{2}}}{6 \sqrt{2 \pi } \lambda } $$ If you plot this as a function of $\alpha$, you will get the same curve as is shown in the second to last figure in http://www.johndcook.com/blog/normal_approx_to_poisson/.

Here are the commands I used:

  n = lambda + alpha Sqrt[lambda];
  p1 = Exp[-lambda] lambda^n/n!;
  p2 = Integrate[1/Sqrt[2 Pi]/Sqrt[lambda] Exp[-(x-lambda)^2/2/lambda], {x, n-1/2, n+1/2}];
  Series[p1, {lambda, Infinity, 1}]
  Series[p2, {lambda, Infinity, 1}]
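The asymptotic difference can also be checked numerically. The following is a rough R sketch, not the author's method: it takes the difference as Poisson minus normal (an assumption about the sign convention) and picks $\lambda = 10^4$ so that $n = \lambda + \alpha\sqrt\lambda$ is an integer. The function name `centered_check` is hypothetical.

```r
# Compare Pr(X = n) - Pr(Y in [n - 1/2, n + 1/2]) with the leading-order formula
centered_check <- function(lambda, alpha) {
  n <- lambda + alpha * sqrt(lambda)
  stopifnot(abs(n - round(n)) < 1e-8)   # n must be an integer count
  n <- round(n)
  s <- sqrt(lambda)
  p1 <- dpois(n, lambda)                                      # Poisson pmf at n
  p2 <- pnorm(n + 0.5, lambda, s) - pnorm(n - 0.5, lambda, s) # normal interval
  predicted <- alpha * (alpha^2 - 3) * exp(-alpha^2 / 2) /
    (6 * sqrt(2 * pi) * lambda)
  c(actual = p1 - p2, predicted = predicted)
}

res1 <- centered_check(1e4, 1)
res2 <- centered_check(1e4, 2)
print(rbind(alpha_1 = res1, alpha_2 = res2))
```

The two columns should agree to within the next-order term, which is smaller by a factor of roughly $\sqrt\lambda$.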

Also, with a bit of experimentation, it seems to me that a better asymptotic approximation to $\Pr(X = n)$ is $\Pr(Y \in [n-\alpha^2/6,n+1-\alpha^2/6])$. Then the error is $$ -\frac{\left(5 \alpha ^4-9 \alpha ^2-6\right) e^{-{\alpha ^2}/{2}} }{72 \sqrt{2 \pi } \lambda ^{3/2} } $$ which is about $\sqrt\lambda$ times smaller.
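As a hedged numerical illustration of the shifted interval, the sketch below compares the error of the centered interval $[n-\frac12, n+\frac12]$ with that of $[n-\alpha^2/6, n+1-\alpha^2/6]$ at $\lambda = 10^4$ and $\alpha = 1$. It assumes the error is defined as the Poisson probability minus the normal probability; `shifted_check` and its output names are illustrative.

```r
shifted_check <- function(lambda, alpha) {
  n <- round(lambda + alpha * sqrt(lambda))  # integer count alpha SDs from the mean
  s <- sqrt(lambda)
  p1 <- dpois(n, lambda)

  # Centered interval [n - 1/2, n + 1/2]
  centered <- pnorm(n + 0.5, lambda, s) - pnorm(n - 0.5, lambda, s)

  # Shifted interval [n - alpha^2/6, n + 1 - alpha^2/6]
  shift   <- alpha^2 / 6
  shifted <- pnorm(n + 1 - shift, lambda, s) - pnorm(n - shift, lambda, s)

  # Leading-order error quoted in the text for the shifted interval
  predicted <- -(5 * alpha^4 - 9 * alpha^2 - 6) * exp(-alpha^2 / 2) /
    (72 * sqrt(2 * pi) * lambda^1.5)

  c(centered_error = p1 - centered,
    shifted_error = p1 - shifted,
    predicted_shifted_error = predicted)
}

res <- shifted_check(1e4, 1)
print(res)
```

With these inputs the shifted error is much smaller in magnitude than the centered error, consistent with the claimed $\sqrt\lambda$ improvement.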
