Simulate interval data from a Negative Binomial point process.

poisson distribution, poisson process, probability, simulation, stochastic-processes

In section 5.4.3 of "Introduction to Probability Models", Ross describes the "Negative Binomial point process", which is obtained by mixing the rate parameter $\lambda$ of a Poisson process with a Gamma distribution. The result is a point process in which the number of events in an interval of length $t$ is a Negative Binomial random variable with success probability $p=\frac{\theta}{\theta+t}$ and target number of successes $m$ (here, $\theta$ and $m$ are the parameters of the Gamma distribution used for the mixing). It is clear how to simulate the number of events in an interval of length $t$, but I need more than that: I need the actual time stamps of the events generated by this process. How do I go about generating them?


What I attempted:

For a Poisson process, I generate exponential random variables with the appropriate rate parameter as the inter-arrival times; the cumulative sums of the inter-arrival times then give the time stamps at which the events occur. I extended this approach to the mixed Poisson process by mixing the exponential with a Gamma: generate a rate from the Gamma, simulate one exponential inter-arrival time with that rate, and repeat. This didn't produce the required point process. I know because even the mean number of events in a given interval didn't match the mean expected from the Negative Binomial (it was much lower). The mean number of events also depended on where the interval started, tending to be higher for intervals near the start. This contradicts the point process described in the book, since the start of section 5.4.3 says that such a process has stationary increments.

Best Answer

Your description of how you simulated the process is ambiguous. I suspect you drew a new gamma-distributed $\lambda$, and then an exponentially distributed interarrival time with rate $\lambda$, for every successive time stamp. If so, you weren't simulating the point process Ross describes in his section $5.4.3$, which would explain both why the number of events you saw in any given interval was much smaller than you expected and why stationarity failed.
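To see why redrawing $\lambda$ for every arrival behaves this way, here is a short worked calculation. It assumes the rate parametrization implied by the question's $p=\frac{\theta}{\theta+t}$, that is, $\lambda$ has density $g(\lambda)=\theta e^{-\theta\lambda}(\theta\lambda)^{m-1}/\Gamma(m)$, and it assumes $m>1$. Under that scheme the interarrival times are independent and identically distributed with density

$$f(t)=\int_0^\infty \lambda e^{-\lambda t}\,g(\lambda)\,d\lambda=\frac{m\,\theta^{m}}{(\theta+t)^{m+1}},\qquad t\ge 0,$$

so the simulated process is a renewal process with heavy-tailed interarrival times rather than a mixed Poisson process. The mean interarrival time is

$$E\!\left[\frac{1}{\lambda}\right]=\frac{\theta}{m-1},$$

which gives a long-run event rate of $\frac{m-1}{\theta}$, compared with the mean rate $E[\lambda]=\frac{m}{\theta}$ of the process Ross describes. For $m=2$ that is half the expected rate. The renewal density of this process equals $f(0)=\frac{m}{\theta}$ at $t=0$ and tends to $\frac{m-1}{\theta}$ as $t\to\infty$, which is consistent with seeing more events in intervals near the start.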

To simulate the point process described in Ross's section $5.4.3$, you should draw a single gamma-distributed $\lambda$ and then generate all your time stamps as cumulative sums of independent interarrival times, each drawn from the exponential distribution with cumulative distribution function $1-e^{-\lambda t}$ for that same $\lambda$.
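A minimal Python sketch of this recipe, assuming NumPy is available. The function name `simulate_event_times` and the horizon `t_max` are illustrative choices, not anything from Ross. The Gamma is parametrized by shape and scale as NumPy does it, so with a rate-type $\theta$ as in $p=\frac{\theta}{\theta+t}$ one would pass `scale = 1/theta`.

```python
import numpy as np

def simulate_event_times(shape, scale, t_max, rng):
    """Simulate one path of a Gamma-mixed Poisson process on [0, t_max).

    A single rate lam is drawn from Gamma(shape, scale) and then reused
    for every exponential interarrival time of this path.
    Returns (lam, sorted array of event time stamps).
    """
    lam = rng.gamma(shape, scale)          # one rate for the whole path
    times = []
    t = rng.exponential(1.0 / lam)         # NumPy's exponential takes the mean 1/lam
    while t < t_max:
        times.append(t)
        t += rng.exponential(1.0 / lam)    # same lam for every interarrival time
    return lam, np.array(times)

rng = np.random.default_rng(0)             # arbitrary seed, for reproducibility only
lam, times = simulate_event_times(shape=2.0, scale=0.4, t_max=25.0, rng=rng)
print("rate:", lam)
print("number of events before t_max:", times.size)
```

The key point is that `lam` is drawn outside the loop; moving the `rng.gamma` call inside the loop would give the redraw-every-arrival scheme instead.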

I did this $100$ times, with $100$ values of $\lambda$ drawn from a gamma distribution with shape $m=2$ and scale $\theta=0.4$ (so $\theta$ here is a scale parameter, the reciprocal of the rate-type $\theta$ that appears in $p=\frac{\theta}{\theta+t}$), and generated $100$ time stamps for each $\lambda$, using the corresponding exponential distribution. The sample mean of the $\lambda$ values was $0.844$, not significantly different from the expected $m\theta=0.8$, and they ranged from a low of $0.098$ to a high of $2.37$. When I tallied the number of time stamps that fell in each of the intervals $[0,5)$, $[5,10)$, $[10,15)$, $[15,20)$ and $[20,25)$ for each of the $100$ point processes, I obtained the results summarised in the following table:

\begin{array}{l|ccccc}
\text{interval}&[0,5)&[5,10)&[10,15)&[15,20)&[20,25)\\
\hline
\text{lowest number}&0&0&0&0&0\\
\hline
\text{highest number}&14&16&13&18&15\\
\hline
\text{sample mean}&3.97&4.05&4.51&4.10&4.01\\
\hline
\end{array}

These agree well with theoretical expectations: the low counts tend to occur for the point processes with smaller values of $\lambda$ and the high counts for those with larger values, and the expected number of time stamps in an interval of length $t$ is $m\theta t=4$ for $t=5$.
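An experiment of this kind can be sketched in Python as follows, assuming NumPy. This is not the code used to produce the table above: the helper names `event_times` and `tally` and the seed are made up for illustration, and each path is generated up to time $25$ rather than for a fixed $100$ time stamps, so the numbers it prints will differ from run to run and from the table.

```python
import numpy as np

def event_times(lam, t_max, rng):
    """Cumulative sums of Exp(lam) interarrival times, kept while below t_max."""
    times = []
    t = rng.exponential(1.0 / lam)
    while t < t_max:
        times.append(t)
        t += rng.exponential(1.0 / lam)
    return np.array(times)

def tally(n_paths, shape, scale, edges, rng):
    """Simulate n_paths mixed Poisson paths and count events per interval.

    Returns (rates, counts) where counts[i, j] is the number of events of
    path i that fall between edges[j] and edges[j + 1].
    """
    rates = rng.gamma(shape, scale, size=n_paths)   # one rate per path
    counts = np.empty((n_paths, len(edges) - 1), dtype=int)
    for i, lam in enumerate(rates):
        path = event_times(lam, edges[-1], rng)
        counts[i] = np.histogram(path, bins=edges)[0]
    return rates, counts

rng = np.random.default_rng(2024)                   # arbitrary seed
edges = np.arange(0, 30, 5)                         # 0, 5, 10, 15, 20, 25
rates, counts = tally(100, shape=2.0, scale=0.4, edges=edges, rng=rng)

print("sample mean of rates:", rates.mean())
print("lowest count per interval: ", counts.min(axis=0))
print("highest count per interval:", counts.max(axis=0))
print("mean count per interval:   ", counts.mean(axis=0))
```

With shape $2$ and scale $0.4$ the per-interval means should scatter around $m\theta t=4$ for every interval, which is the stationarity the question was missing.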
