Related to glm() in R: I have seen a few posts recommending modeling underdispersed data with the Conway–Maxwell–Poisson distribution, specifically with the R package CompGLM. However, I have not seen anybody confirm that quasi-Poisson cannot be used. Therefore, I ask: why not use quasi-Poisson in glm for underdispersed data? After all, isn't the idea of quasi-Poisson to go beyond the assumption that the variance and the mean are equal (and in the case of underdispersion, they are not equal)?
Basically, I am running glm(y ~ x, family=poisson), where x is a categorical variable, and I am getting
Null deviance: 67.905 on 519 degrees of freedom
Residual deviance: 59.584 on 507 degrees of freedom
This strongly suggests underdispersion, and I am therefore leaning towards a quasi-Poisson solution.
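For reference, the rough dispersion implied by the output above is the residual deviance divided by its degrees of freedom. The snippet below only redoes that arithmetic with the two reported numbers; the variable names are made up for illustration. Keep in mind that deviance/df is only a crude indicator, particularly when the counts are small, and that summary() of a quasipoisson fit reports a Pearson-based estimate instead.

```r
# Numbers copied from the glm output quoted above
resid_dev <- 59.584
resid_df  <- 507

# Rough dispersion estimate: residual deviance per degree of freedom.
# Values well below 1 point towards underdispersion.
dispersion_rough <- resid_dev / resid_df
print(round(dispersion_rough, 3))
```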
Best Answer
Quasi-likelihood theory is as valid with underdispersed data as it is with overdispersed data, so you could just go that way.
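As a minimal sketch of what "going that way" looks like in R, the code below fits both families to simulated data. The data are not the asker's: as an assumption made purely for illustration, the counts are drawn from a binomial with size 10, which has variance below its mean, and x is a hypothetical three-level factor. The point is that quasipoisson leaves the coefficients unchanged and rescales the standard errors by the square root of the estimated dispersion, which shrinks them when the dispersion is below 1.

```r
set.seed(1)

# Simulated underdispersed counts (illustrative assumption, not real data):
# Binomial(10, p) has mean 10p and variance 10p(1 - p) < mean.
n_per_group <- 200
x <- factor(rep(c("a", "b", "c"), each = n_per_group))
p <- c(a = 0.2, b = 0.4, c = 0.6)[as.character(x)]
y <- rbinom(length(x), size = 10, prob = p)

fit_pois  <- glm(y ~ x, family = poisson)
fit_quasi <- glm(y ~ x, family = quasipoisson)

# Pearson-based dispersion estimate: sum of squared Pearson residuals
# divided by the residual degrees of freedom.
phi_hat <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)

# The dispersion reported by summary() for the quasipoisson fit
phi_summary <- summary(fit_quasi)$dispersion

# Point estimates are identical; only the standard errors change
se_pois  <- coef(summary(fit_pois))[, "Std. Error"]
se_quasi <- coef(summary(fit_quasi))[, "Std. Error"]
se_ratio <- se_quasi / se_pois  # equals sqrt(phi_hat) for every coefficient

print(c(phi_hat = phi_hat, phi_summary = phi_summary))
print(se_ratio)
```

With real data, the same comparison shows how much narrower the quasi-Poisson intervals become, which is exactly why the mechanism behind the underdispersion deserves a look before trusting them.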
But I would be careful: context matters a lot. While overdispersion is quite common and is easily explained by simple mechanisms, that is not the case with underdispersion! For instance, extra unmodeled (or unobserved) variation or inhomogeneity leads to overdispersion, but can never produce underdispersion. Causes of underdispersion are more difficult to come by; they usually have to do with a lack of independence. For one example, see Causes for Underdispersion in Poisson Regression. One common cause of lack of independence is competition; an example I just came across is counts of territorial birds!
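The claim that unobserved variation can only push the variance up follows from the law of total variance. As a sketch, assume the count $Y$ is Poisson given a random rate $\lambda$ (the symbol $\lambda$ is introduced here for illustration and stands for whatever unmodeled heterogeneity is present). Then

$$
\operatorname{Var}(Y) = \operatorname{E}\big[\operatorname{Var}(Y \mid \lambda)\big] + \operatorname{Var}\big(\operatorname{E}[Y \mid \lambda]\big) = \operatorname{E}[\lambda] + \operatorname{Var}(\lambda) \ge \operatorname{E}[\lambda] = \operatorname{E}[Y],
$$

with equality only when $\lambda$ is constant. So mixing conditionally Poisson counts over a varying rate gives a variance at least as large as the mean, and getting a variance below the mean requires some departure from the conditionally Poisson assumption, such as dependence between the events being counted.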
Some posts dealing with practical matters when modeling with underdispersion are
GLM for proportional data and underdispersion,
Overdispersion and Underdispersion in Negative Binomial/Poisson Regression,
Is there a common underdispersed discrete distribution with unbounded support for general mean and variance?,
Are these data underdispersed? If so, what mechanisms may explain this?