Simulating a mixed Poisson gives me fewer events than expected.

exponential distributionpoisson processprobabilitypythonsimulation

Referencing section 5.4.3 of the book, Introduction to Probability Models by Sheldon Ross (10th edition), the Mixed Poisson is introduced where the rate, $\lambda$ of a Poisson process is itself drawn from a Gamma distribution. Let this Gamma distribution we're drawing $\lambda$ from be described by $L$. On page 352 we get:

$$E(N(t)|L) = Lt$$

And so we get the intuitive result:

$$E(N(t)) = E(E(N(t)|L)) = E(L)t$$

I'm trying to replicate this with simulation. Let's start with a simple Poisson process. We use Python to simulate inter-arrival exponential distributions and stop when a total time of 1000 units has elapsed. When the rate, $\lambda=5$ per the code below, we get about 5000 events once 1000 units of time have elapsed. No surprises here.

import numpy as np
period=1000
evnts = 0
t=0
lm = 5
while t<period:
    t_del = np.random.exponential(1/lm)
    t+=t_del
    evnts+=1
print(evnts)

Now, I want to extend this simulation to a mixed Poisson as described in Ross. Instead of simulating from a fixed exponential every time, I first simulate the $\lambda$ parameter from a Gamma distribution. I confirmed that the Gamma used does have a mean of 5. I then use that $\lambda$ to simulate an exponential distribution. And the number of events in 1000 units of time drops to about 3000 instead of 5000 as expected from the equation above. What am I missing? Why should the number of events be smaller?

import numpy as np
period=1000
evnts = 0
t=0
while t<period:
    theta = 0.5
    lm = np.random.gamma(5*theta,1/theta)
    t_del = np.random.exponential(1/lm)
    t+=t_del
    evnts+=1
print(evnts)

Best Answer

To simulate a mixed Poisson you first generate a single Gamma variable $L$. This value for $L$ gives you the $\lambda$ (rate) parameter, which you use for the duration of your Poisson process. In other words, you have to move the line

lm = np.random.gamma(5*theta,1/theta)

out of your while loop. When you do this, your event count will end up closer to the expected 5000.

Your current code is not simulating a Poisson process, since the interarrival times are not drawn from a single exponential distribution. Rather, each interarrival time is drawn from a different exponential, whose rate parameter $\lambda$ ranges from about $5-\sqrt{10}$ to $5+\sqrt{10}$ (these values are the mean $\pm$ standard deviation). Your code as written will generate many low values for $\lambda$ (which correspond to high interarrival times, which eat up a lot of the period) on virtually every run of your simulated point process; this is what causes the event count to drop. When you simulate properly, the low-$\lambda$ process will occur, but much less often, so low event counts are much less frequent.

Related Question