I am trying to understand the meaning of $\lambda$ in Poisson distributions. I know that it is the average rate of occurrences of the event, but I have not been able to fully understand what that means.
In "A First Course in Probability" by Sheldon Ross, the author says that a Poisson distribution may be used as an approximation for a binomial distribution with parameters $(n,p)$ when $n$ is large and $p$ is small enough so that $np$ is of moderate size.
- What does $np$ being of moderate size mean? What is considered as of moderate size?
- Why does $np$ have to be of moderate size? Why can a Poisson distribution not be used as an approximation for a binomial distribution if $np$ is too big or too small?
Also, in other books, I read that a Poisson distribution is the limiting case of a binomial distribution when $\lambda=np$ is constant as $n\to\infty$.
Under what conditions is $\lambda$ constant as $n\to\infty$?
I am new to probability. If someone can provide the intuition behind Poisson distributions (specifically about $\lambda$), I would greatly appreciate it.
Best Answer
This is a fantastic question, as it gets at the heart of a common misconception. The answer to the question "Why does $np$ have to be of moderate size?" is that it does not! The only thing that is needed for Poisson approximation is that $p$ is small.
To make this precise, we need a way to measure the quality of a probabilistic approximation. If $X$ and $Y$ are random variables, we define the total variation distance between them to be $$ d_{\text{TV}}(X, Y)=\sup_{A} |P(X\in A)-P(Y\in A)| $$ If this distance is small, say it is at most $\epsilon$, then you can use the distribution of $Y$ to calculate the probability of any event involving $X$, with an absolute error of at most $\epsilon$.
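It can be shown that if $X\sim \text{Binomial}(n,p)$ and $Y\sim \text{Poisson}(np)$, then $$ d_\text{TV}(X, Y)\le \min\left\{p, \,\,np^2\right\}\tag1 $$ Therefore, as long as $p$ is small, the quality of the Poisson approximation is good, no matter how large or small $np$ is.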
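As a numerical sanity check, here is a minimal Python sketch that computes the total variation distance between Binomial$(n,p)$ and Poisson$(np)$ and compares it with $\min\{p, np^2\}$, which is what the general bound for independent events reduces to when every probability equals $p$. It relies on the standard fact that, for distributions on the nonnegative integers, the supremum over events equals half the sum of the absolute differences of the probability mass functions. The function names and the particular $(n,p)$ pairs are my own illustrative choices.

```python
import math

def binom_logpmf(k, n, p):
    # log P(X = k) for X ~ Binomial(n, p), computed in log space for stability
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))

def poisson_logpmf(k, lam):
    # log P(Y = k) for Y ~ Poisson(lam)
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def tv_binomial_poisson(n, p):
    # d_TV = (1/2) * sum_k |P(X = k) - P(Y = k)| for integer-valued X, Y
    lam = n * p
    abs_diff = 0.0
    poisson_mass = 0.0
    for k in range(n + 1):
        b = math.exp(binom_logpmf(k, n, p))
        q = math.exp(poisson_logpmf(k, lam))
        abs_diff += abs(b - q)
        poisson_mass += q
    # The Poisson puts mass above n, where the binomial has none
    tail = max(0.0, 1.0 - poisson_mass)
    return 0.5 * (abs_diff + tail)

# Illustrative cases: np tiny, np equal to 1, and np large, all with small p
for n, p in [(10, 0.001), (100, 0.01), (20000, 0.01)]:
    tv = tv_binomial_poisson(n, p)
    bound = min(p, n * p * p)
    print(f"n={n}, p={p}, np={n * p:g}: d_TV={tv:.3e}, bound={bound:.3e}")
```

In each case the computed distance should sit below the bound, including the case where $np$ is in the hundreds.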
This same idea is true in greater generality. For a Binomial distribution, there are $n$ independent and equiprobable events; we can drop the requirement of equiprobability. Let $E_1,\dots,E_n$ be independent events, with $P(E_i)=p_i$ for $i\in \{1,\dots,n\}$. If we let $X$ be the number of events which occur, and let $Z$ be a Poisson random variable with the same mean as $X$, meaning that $Z\sim \text{Poisson}(\lambda)$ with $\lambda=\sum_{i=1}^n p_i$, then
$$ d_\text{TV}(X, Z)\le \min\left\{\frac1{\lambda}\sum_{i=1}^n p_i^2, \,\,\sum_{i=1}^n p_i^2\right\}\tag2 $$ In the case where $p_1=\dots=p_n=p$, we recover $(1)$. For a proof of $(2)$, see Approximate Computations of Expectations, by Charles Stein. Equation $(2)$ appears on page 89.
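Inequality $(2)$ can also be checked numerically. The sketch below builds the exact distribution of $X$ by convolving one event at a time, then compares its total variation distance from Poisson$(\lambda)$ with the right-hand side of $(2)$. The list `probs` is a made-up set of unequal probabilities chosen only for illustration, and the helper names are hypothetical.

```python
import math

def count_pmf(probs):
    # PMF of the number of independent events that occur, built by
    # convolving in one Bernoulli(p_i) at a time
    pmf = [1.0]
    for p in probs:
        new = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            new[k] += mass * (1 - p)      # event i does not occur
            new[k + 1] += mass * p        # event i occurs
        pmf = new
    return pmf

def poisson_pmf(k, lam):
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def tv_to_poisson(probs):
    # Half the L1 distance between the PMF of X and Poisson(sum of p_i)
    lam = sum(probs)
    pmf = count_pmf(probs)
    inside = sum(abs(q - poisson_pmf(k, lam)) for k, q in enumerate(pmf))
    tail = max(0.0, 1.0 - sum(poisson_pmf(k, lam) for k in range(len(pmf))))
    return 0.5 * (inside + tail)

def bound_2(probs):
    # Right-hand side of (2): min{ (1/lambda) * sum p_i^2, sum p_i^2 }
    lam = sum(probs)
    sum_sq = sum(p * p for p in probs)
    return min(sum_sq / lam, sum_sq)

# Hypothetical example: 200 events with probabilities cycling through 0.01..0.05
probs = [0.01 * (i % 5 + 1) for i in range(200)]
print("d_TV  =", tv_to_poisson(probs))
print("bound =", bound_2(probs))
```

Passing a list of equal probabilities, such as `[0.1] * 50`, makes `bound_2` return $\min\{p, np^2\}$ up to rounding.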
Some comments on the semantic parts of your questions.
"Moderate" is not a precise term; it just means "neither large nor small," but large and small are subjective.
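When people say that $\lambda=np$ is constant as $n\to\infty$, it just means that $n$ is getting larger, and for each particular instance of $n$, the corresponding value of $p$ is equal to $\lambda/n$. Saying "Binomial$(n,p)$ converges to Poisson$(\lambda)$ as $n\to\infty$ with $np=\lambda$ held constant" means the exact same thing as "Binomial$(n,\lambda/n)$ converges to Poisson$(\lambda)$ as $n\to\infty$."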
So, Binomial$(10,0.5)$ is pretty close to a Poisson$(5)$ distribution, Binomial$(50,0.1)$ is even closer, Binomial$(500,0.01)$ is extremely close, etc.
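To put numbers on "pretty close" versus "extremely close", the following sketch computes the total variation distance from Poisson$(5)$ for each of those three binomials. It builds both probability mass functions with the usual ratio recurrences, $P(X=k)/P(X=k-1)=\frac{n-k+1}{k}\cdot\frac{p}{1-p}$ for the binomial and $\lambda/k$ for the Poisson. The function names are my own.

```python
import math

def binomial_pmf(n, p):
    # P(X = 0), ..., P(X = n) via the ratio recurrence
    pmf = [(1 - p) ** n]
    for k in range(1, n + 1):
        pmf.append(pmf[-1] * (n - k + 1) / k * p / (1 - p))
    return pmf

def poisson_pmf(lam, kmax):
    # P(Y = 0), ..., P(Y = kmax) via the ratio recurrence
    pmf = [math.exp(-lam)]
    for k in range(1, kmax + 1):
        pmf.append(pmf[-1] * lam / k)
    return pmf

def tv_distance(n, p):
    # Half the L1 distance between Binomial(n, p) and Poisson(n * p)
    b = binomial_pmf(n, p)
    q = poisson_pmf(n * p, n)
    inside = sum(abs(bk - qk) for bk, qk in zip(b, q))
    tail = max(0.0, 1.0 - sum(q))   # Poisson mass above n
    return 0.5 * (inside + tail)

lam = 5
distances = [tv_distance(n, lam / n) for n in (10, 50, 500)]
for n, d in zip((10, 50, 500), distances):
    print(f"Binomial({n}, {lam / n:g}) vs Poisson({lam}): d_TV = {d:.4f}")
```

The printed distances should shrink as $n$ grows with $np=5$ held fixed, and each should stay below the corresponding $p$.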