Binomial Distribution – Understanding Why Poisson Distribution is the Limiting Case of Binomial Distribution

binomial distribution, combinatorics, intuition, poisson distribution, probability-calculus

In "Data Analysis" by D. S. Sivia, there is a derivation of the Poisson distribution from the binomial distribution.

They argue that the Poisson distribution is the limiting case of the binomial distribution when $M\rightarrow\infty$, where $M$ is the number of trials.

Question 1: How can that argument intuitively be understood?

Question 2: Why is the large-$M$ limit of $\frac{M!}{N!(M-N)!}$ equal to $\frac{M^{N}}{N!}$, where $N$ is the number of successes in $M$ trials? (This step is used in the derivation.)

I will try a simple intuitive explanation. Recall that a binomial random variable $X \sim \text{Bin}(n,p)$ has expectation $np$ and variance $np(1-p)$. Now suppose $X$ counts the number of events in a very large number $n$ of trials, each with a very small probability $p$, so that $1-p \approx 1$. Write $np=\lambda$. Then $np(1-p) \approx np = \lambda$, so the mean and variance are both (approximately) equal to $\lambda$. Now remember that a Poisson-distributed random variable always has its mean equal to its variance! That is at least a plausibility argument for the Poisson approximation, but not a proof.
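To go one step beyond plausibility, here is a sketch of the limit itself, written with $k$ for the number of successes (a symbol introduced here for illustration) and assuming $p=\lambda/n$ with $\lambda$ and $k$ held fixed while $n\rightarrow\infty$:

$$\binom{n}{k}p^{k}(1-p)^{n-k} = \frac{n(n-1)\cdots(n-k+1)}{k!}\,\frac{\lambda^{k}}{n^{k}}\left(1-\frac{\lambda}{n}\right)^{n-k} = \frac{\lambda^{k}}{k!}\left[\prod_{j=0}^{k-1}\left(1-\frac{j}{n}\right)\right]\left(1-\frac{\lambda}{n}\right)^{n}\left(1-\frac{\lambda}{n}\right)^{-k}.$$

As $n\rightarrow\infty$, the product in square brackets tends to $1$ (it has a fixed number $k$ of factors, each tending to $1$), $\left(1-\frac{\lambda}{n}\right)^{n}\rightarrow e^{-\lambda}$, and $\left(1-\frac{\lambda}{n}\right)^{-k}\rightarrow 1$, so the binomial probability tends to $\frac{\lambda^{k}e^{-\lambda}}{k!}$, which is the Poisson probability.

The square-bracket factor is also what sits behind Question 2. In the question's notation,

$$\frac{M!}{N!(M-N)!} = \frac{M(M-1)\cdots(M-N+1)}{N!} = \frac{M^{N}}{N!}\prod_{j=0}^{N-1}\left(1-\frac{j}{M}\right),$$

so for fixed $N$ the ratio of $\frac{M!}{N!(M-N)!}$ to $\frac{M^{N}}{N!}$ tends to $1$ as $M\rightarrow\infty$. Strictly speaking, both sides grow without bound, and it is their ratio that converges.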

Then look at it from another viewpoint: the Poisson point process on the real line. This is the distribution of random points on the line that we get if the points occur according to these rules:

  1. points in disjoint intervals are independent
  2. probability of a random point in a very short interval is proportional to the length of the interval
  3. probability of two or more points in a very short interval is essentially zero.

Then the number of points in a given interval (not necessarily short) has a Poisson distribution (with parameter $\lambda$ proportional to the length of the interval). Now, if we divide this interval into a very large number $n$ of equally short subintervals, the probability of two or more points in any given subinterval is essentially zero, so the number of points in each subinterval has, to a very good approximation, a Bernoulli distribution, that is, $\text{Bin}(1,p)$ with $p \approx \lambda/n$. By rule 1 the counts in the subintervals are independent, so their sum is approximately $\text{Bin}(n,p)$, which is therefore a good approximation to the Poisson distribution of the number of points in that (long) interval.
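The quality of this approximation can be checked numerically. The following is a minimal sketch, not a proof: it assumes $p=\lambda/n$, picks $\lambda=3$ purely for illustration, and uses helper names (`binom_pmf`, `poisson_pmf`, `max_pmf_gap`) that are made up for this example. It compares the $\text{Bin}(n,\lambda/n)$ and Poisson($\lambda$) probabilities as the number of subintervals $n$ grows, and prints the binomial variance alongside.

```python
import math


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(Y = k) for Y ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def max_pmf_gap(n, lam, k_max=30):
    """Largest |Bin(n, lam/n) pmf - Poisson(lam) pmf| over k = 0..min(n, k_max)."""
    p = lam / n  # assumption: success probability per subinterval is lam / n
    return max(
        abs(binom_pmf(k, n, p) - poisson_pmf(k, lam))
        for k in range(min(n, k_max) + 1)
    )


lam = 3.0  # illustrative value, not taken from the text
gaps = {n: max_pmf_gap(n, lam) for n in (10, 100, 1000, 10000)}

for n, gap in gaps.items():
    p = lam / n
    variance = n * p * (1 - p)  # binomial variance; the mean is n * p = lam
    print(f"n={n:>6}  mean={n * p:.4f}  variance={variance:.4f}  max pmf gap={gap:.2e}")
```

With $\lambda$ fixed, the printed gap should shrink as $n$ increases, and the binomial variance should move toward the mean $\lambda$, in line with the mean-equals-variance argument above.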

