Solved – How is the Poisson distribution a distribution? It seems more like a formula

numpy, poisson distribution

I just watched this video: https://www.youtube.com/watch?v=Fk02TW6reiA

It shows a formula to calculate an answer for the following problem:

  • There are 2 customers expected every 3 minutes in a store
  • Therefore there are 6 customers expected every 9 minutes
  • What is the probability of there being 4 or fewer in the store in 9 minutes?
  • Answer is: P(0;6)+…+P(4;6) which is about 0.28
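As a quick check of that figure, here is a minimal sketch that sums the formula directly, assuming the video's P(k;λ) is the usual Poisson formula λ^k e^(−λ) / k!. The helper name `poisson_pmf` is made up for illustration and is not from the video or from numpy.

```python
import math

def poisson_pmf(k, lam):
    """P(k; lam): probability of exactly k events when lam are expected.

    Hypothetical helper, written out from the standard Poisson formula.
    """
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 6  # expected customers in 9 minutes
# Sum P(0;6) + P(1;6) + ... + P(4;6)
p_at_most_4 = sum(poisson_pmf(k, lam) for k in range(5))
print(round(p_at_most_4, 4))  # 0.2851
```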

This makes sense and is well described. However, numpy treats the Poisson distribution basically like a random number generator: http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.poisson.html

We can specify lambda as, say, 5, along with how many numbers are desired (the second argument), and get a big list of integers:

>>> import numpy as np
>>> s = np.random.poisson(5, 10000)
>>> s
array([2, 4, 4, ..., 3, 4, 3])
>>> len(s)
10000

These seem like two totally different things. How do you get from using the Poisson formula to calculate the probability of a certain number of events in a timeframe, to a list of seemingly random integers?

Best Answer

The formula $f$ is the probability mass function for the Poisson distribution. That formula, as explained in the video, can be used to calculate the probability of a given value under the assumed distribution. The related cumulative distribution function $F$ can be used to generate random numbers following the distribution:

  1. Use the CDF to partition the interval $(0,1)$ into subintervals, one per possible value: $(0, F(x_1))$, $(F(x_1), F(x_2))$, etc.
  2. Generate uniform random numbers on the interval $(0,1)$ and see which bin each one falls into. The value belonging to that bin is the generated number.
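The two steps above can be sketched in plain Python. This is only an illustration of the idea, not a claim about how numpy implements `np.random.poisson` internally. The function name `poisson_sample` is hypothetical, and the bins are taken to start at the value 0, so the first bin is $(0, F(0))$.

```python
import math
import random

def poisson_sample(lam):
    """Draw one Poisson(lam) value by locating a uniform draw in the CDF bins.

    Hypothetical helper for illustration only.
    """
    u = random.random()       # step 2: uniform number on [0, 1)
    k = 0
    pmf = math.exp(-lam)      # P(0; lam)
    cdf = pmf                 # F(0), upper edge of the first bin
    # Step 1 is done lazily: extend the partition until u falls inside a bin.
    # The pmf > 0 guard stops the loop if rounding keeps cdf just below 1.
    while u >= cdf and pmf > 0:
        k += 1
        pmf *= lam / k        # P(k; lam) computed from P(k-1; lam)
        cdf += pmf            # F(k)
    return k

random.seed(0)
samples = [poisson_sample(5) for _ in range(10000)]
print(samples[:10])                  # a list of seemingly random integers
print(sum(samples) / len(samples))   # should be close to lam = 5

# Tie back to the question: the share of draws that are 4 or fewer
# with lam = 6 should land near the 0.285 given by the formula.
draws = [poisson_sample(6) for _ in range(100_000)]
frac = sum(d <= 4 for d in draws) / len(draws)
print(frac)
```

So the list of integers and the formula are two views of the same object: the formula gives the probability of each value, and the generator produces values with those frequencies.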

More in this tutorial, which goes through a Poisson example using R. The Poisson PMF and CDF are available in scipy.
