I just watched this video: https://www.youtube.com/watch?v=Fk02TW6reiA
It shows a formula to calculate an answer for the following problem:
- There are 2 customers expected every 3 minutes in a store
- Therefore there are 6 customers expected every 9 minutes
- What is the likelihood of there being 4 or fewer in the store in 9 minutes?
- Answer is: P(0;6)+…+P(4;6), which is about 0.285
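The sum in the last bullet can be checked directly from the Poisson probability mass function, $P(k;\lambda) = \lambda^k e^{-\lambda} / k!$. Here is a minimal sketch using only the Python standard library; the helper name `poisson_pmf` is made up for illustration and does not come from the video:

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when lam events are expected."""
    return lam**k * math.exp(-lam) / math.factorial(k)

lam = 6  # expected customers in 9 minutes
# Add up P(0;6), P(1;6), ..., P(4;6)
p_at_most_4 = sum(poisson_pmf(k, lam) for k in range(5))
print(round(p_at_most_4, 3))  # 0.285
```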
This makes sense and is well described. However, NumPy treats the Poisson distribution basically like a random number generator: http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.poisson.html
We can specify lambda as, say, 5 and the number of values desired (the second argument), and get back a big array of integers:
>>> import numpy as np
>>> s = np.random.poisson(5, 10000)
>>> s
array([2, 4, 4, ..., 3, 4, 3])
>>> len(s)
10000
These seem like two totally different things. How do you get from using the Poisson formula to calculate the probability of a certain number of events in a timeframe, to a list of seemingly random integers?
Best Answer
The formula $f$ is the probability mass function of the Poisson distribution. As explained in the video, it gives the probability of a given value under the assumed distribution. The related cumulative distribution function $F$ can be used to generate random numbers following the distribution: draw $u$ uniformly from $(0, 1)$ and return the smallest $k$ with $F(k) \ge u$ (inverse transform sampling).
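To make the link between $F$ and a list of random integers concrete, here is a rough sketch of inverse transform sampling in plain Python. The function name `sample_poisson` and the seed are assumptions chosen for illustration, and this is only one way to do it: NumPy's own generator may well use a different, faster algorithm.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) value by inverting the CDF F."""
    u = rng.random()       # uniform on [0, 1)
    k = 0
    p = math.exp(-lam)     # f(0) = P(X = 0)
    cdf = p                # F(0)
    # Step k upward until F(k) >= u; the p > 0 check guards against
    # floating-point rounding when u is extremely close to 1.
    while cdf < u and p > 0:
        k += 1
        p *= lam / k       # f(k) obtained from f(k - 1)
        cdf += p           # F(k)
    return k

rng = random.Random(0)     # arbitrary seed, only for repeatability
samples = [sample_poisson(5, rng) for _ in range(10000)]
print(samples[:10])
print(sum(samples) / len(samples))  # sample mean, should be near lambda = 5
```

Each integer in `samples` is a uniform random number pushed through the inverse of $F$, so values with a high probability under $f$ show up proportionally more often.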
More in this tutorial, which goes through a Poisson example using R. The Poisson PMF and CDF are available in SciPy as scipy.stats.poisson.pmf and scipy.stats.poisson.cdf.
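As a quick illustration of how the two views agree, the sketch below evaluates the PMF and CDF with `scipy.stats.poisson` for the store example (lambda = 6) and compares the CDF with the fraction of simulated values that are 4 or fewer. The seed and the sample size are arbitrary choices, and `np.random.default_rng` is used here in place of the older `np.random.poisson` call from the question:

```python
import numpy as np
from scipy.stats import poisson

lam = 6  # expected customers in 9 minutes

print(poisson.pmf(4, lam))  # P(X = 4)
print(poisson.cdf(4, lam))  # P(X <= 4), about 0.285

rng = np.random.default_rng(0)   # arbitrary seed, only for repeatability
s = rng.poisson(lam, 100_000)    # simulated customer counts
empirical = np.mean(s <= 4)      # fraction of simulations with 4 or fewer
print(empirical)                 # should land close to the CDF value
```

With more simulated values, the empirical fraction settles closer to the value given by the formula.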