Solved – Generating over-dispersed counts data with serial correlation

Tags: distributions, poisson distribution, r, simulation, time series

Could anyone suggest how to generate over-dispersed count data with serial correlation? I am using R to conduct a simulation study. Any references on this subject would be much appreciated.

Thanks for your help.

Best Answer

A standard way of generating overdispersed count data is to generate data from a Poisson distribution with a random mean: $Y_i\sim Poisson(\lambda_i)$, $\lambda_i \sim F$. For example, if $\lambda_i$ has a Gamma distribution, you will get the negative binomial distribution for $Y$.
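As a quick check of the Gamma case, here is a minimal sketch. The values of `mu` and `shape` are arbitrary choices for illustration and are not from the question. With $\lambda_i \sim Gamma(\text{shape}=k, \text{rate}=k/\mu)$, $Y$ is negative binomial with mean $\mu$ and variance $\mu + \mu^2/k$, so the mixture and a direct `rnbinom` draw should agree up to sampling noise.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n     <- 1e5
mu    <- 4                       # illustrative marginal mean (assumption)
shape <- 2                       # illustrative Gamma shape, the NB "size" (assumption)

lambda <- rgamma(n, shape = shape, rate = shape / mu)  # random Poisson means
y_mix  <- rpois(n, lambda)                             # Gamma-Poisson mixture
y_nb   <- rnbinom(n, size = shape, mu = mu)            # direct negative binomial draw

# Both should have mean near mu and variance near mu + mu^2 / shape
c(mean(y_mix), var(y_mix))
c(mean(y_nb),  var(y_nb))
mu + mu^2 / shape                # theoretical variance, larger than the mean
```

The variance exceeds the mean by $\mu^2/k$, so a smaller shape gives stronger overdispersion.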

You can easily impose serial correlation by imposing correlation on the $\lambda_i$'s. For example, you could have $\log\lambda_i \sim AR(1)$. Implemented in R:

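N <- 100
rho <- 0.6
# log of the Poisson mean: 1 plus a Gaussian AR(1) series with coefficient rho
log.lambda <- 1 + arima.sim(model=list(ar=rho), n=N)
# counts are Poisson conditional on lambda
y <- rpois(N, lambda=exp(log.lambda))
# lag-1 sample autocorrelation of the counts
> cor(head(y,-1), tail(y,-1))
[1] 0.4132512
# sample variance far above the sample mean, i.e. overdispersion
> mean(y)
[1] 4.35
> var(y)
[1] 33.4015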

Here the $\log\lambda_i$'s are normally distributed, so the $\lambda_i$'s are lognormal and the marginal distribution of $Y$ is not one of the classic count distributions; you could get more creative with the choice of $F$. Also note that the correlation of the $y$'s does not equal rho; it is some function of it.
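For the lognormal example, that function can be written down. The sketch below assumes unit-variance Gaussian innovations (the `arima.sim` default) and a stationary series, so $\log\lambda_i$ has variance $\sigma^2 = 1/(1-\rho^2)$. Conditioning on $\lambda$ gives $Var(Y) = E[\lambda] + Var(\lambda)$ and $Cov(Y_t, Y_{t+1}) = Cov(\lambda_t, \lambda_{t+1})$, and the lognormal moments then do the rest. The helper name `lag1_cor_counts` and its arguments are hypothetical, introduced only for illustration.

```r
# Hypothetical helper: theoretical lag-1 autocorrelation of the counts when
# log(lambda) = mu + AR(1) with coefficient rho and N(0, sd^2) innovations.
lag1_cor_counts <- function(rho, mu = 1, sd = 1) {
  s2      <- sd^2 / (1 - rho^2)                      # stationary variance of log(lambda)
  m_lam   <- exp(mu + s2 / 2)                        # E[lambda], lognormal mean
  v_lam   <- exp(2 * mu + s2) * (exp(s2) - 1)        # Var(lambda)
  cov_lam <- exp(2 * mu + s2) * (exp(rho * s2) - 1)  # Cov(lambda_t, lambda_{t+1})
  cov_lam / (m_lam + v_lam)                          # Var(Y) = E[lambda] + Var(lambda)
}

lag1_cor_counts(0.6)   # roughly 0.39 for rho = 0.6 and an intercept of 1
```

The result is below rho because the Poisson noise adds variance but no covariance. With only 100 observations and a heavy-tailed lognormal mean, sample moments can sit some distance from their theoretical values.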
