I have a data set that I'd expect to follow a Poisson distribution, but it is overdispersed by about 3-fold. At the present, I'm modelling this overdispersion using something like the following code in R.
## assuming a median value of 1500
med = 1500
rawdist = rpois(1000000,med)
oDdist = rawDist + ((rawDist-med)*3)
Visually, this seems to fit my empirical data very well. If I'm happy with the fit, is there any reason that I should be doing something more complex, like using a negative binomial distribution, as described here? (If so, any pointers or links on doing so would be much appreciated).
Oh, and I'm aware that this creates a slightly jagged distribution (due to the multiplication by three), but that shouldn't matter for my application.
Update: For the sake of anyone else who searches and finds this question, here's a simple R function to model an overdispersed poisson using a negative binomial distribution. Set d to the desired mean/variance ratio:
rpois.od<-function (n, lambda,d=1) {
if (d==1)
rpois(n, lambda)
else
rnbinom(n, size=(lambda/(d-1)), mu=lambda)
}
(via the R mailing list: https://stat.ethz.ch/pipermail/r-help/2002-June/022425.html)
Best Answer
for overdispersed poisson, use the negative binomial, which allows you to parameterize the variance as a function of the mean precisely. rnbinom(), etc. in R.