Comparing Quasibinomial GLMs in R – A Detailed Guide

Tags: binomial distribution, generalized linear model, quasi-likelihood, r

I have proportion data that I am trying to fit a GLM to. Because the data are proportions, someone suggested that quasibinomial is the correct family to use. I am now trying to do model selection, but since R does not report AIC for quasi models, I'm at a bit of a loss as to what to do. I've read somewhere that likelihood-based criteria aren't appropriate for quasi models; are there any other options?

Best Answer

If your response variable is a ratio $y = a/b$ of two continuous, strictly positive quantities, $a, b \in \mathbb{R}$ with $0 < a < b$ (so that $0 < y < 1$), then it is reasonable to model it as following a beta distribution. In that case you should fit a beta regression, implemented in the betareg package in R. You can then compare your models with stats::AIC.
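As a rough illustration of the beta regression route, here is a minimal sketch on simulated data. The variable names (`y`, `x1`, `x2`), the sample size, and the true coefficients are all assumptions made up for the example, and the model-fitting part only runs if the betareg package is installed.

```r
# Simulated data (hypothetical): a continuous proportion y in (0, 1)
# and two candidate predictors, of which only x1 affects the mean.
set.seed(1)
n   <- 200
x1  <- rnorm(n)
x2  <- rnorm(n)
mu  <- plogis(0.5 + 0.8 * x1)   # true mean on the (0, 1) scale
phi <- 20                       # precision of the beta distribution
y   <- rbeta(n, mu * phi, (1 - mu) * phi)
d   <- data.frame(y, x1, x2)

if (requireNamespace("betareg", quietly = TRUE)) {
  # Two candidate beta regressions with a logit link for the mean
  m1 <- betareg::betareg(y ~ x1, data = d)
  m2 <- betareg::betareg(y ~ x1 + x2, data = d)

  # betareg models have a proper likelihood, so stats::AIC applies;
  # the result is a data frame with columns df and AIC
  aic_table <- AIC(m1, m2)
  print(aic_table)
}
```

The model with the lower AIC is preferred. Note that the `df` column counts the precision parameter as well as the regression coefficients.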

If $a$ represents a count of successes and $b$ the number of trials, you should instead use logistic regression, which models the response with the binomial distribution. The quasi-binomial family is useful if your data exhibit more (or less) variance than expected under a binomial model. If you choose to use a quasi-binomial model, the AICcmodavg package can provide you with the quasi-likelihood counterpart of AIC (QAIC). However, you should proceed with caution, as this metric may be overly sensitive to departures from the model (Wang et al. 2015).
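To make the quasi-binomial case concrete, here is a minimal base-R sketch on simulated overdispersed data. It computes QAIC by hand using the common convention of estimating the dispersion $\hat{c}$ from the most complex binomial model and counting it as one extra parameter. This is an illustration of the idea, not the AICcmodavg implementation. The data, variable names, and the `qaic` helper are hypothetical. An F test between nested quasi-binomial fits is included as an alternative that needs no information criterion.

```r
# Simulated overdispersed counts (hypothetical): success probabilities
# vary around their mean, so the variance exceeds the binomial variance.
set.seed(42)
n      <- 150
trials <- rep(20, n)
x1     <- rnorm(n)
x2     <- rnorm(n)
p_mean <- plogis(-0.3 + 0.7 * x1)
p_i    <- rbeta(n, p_mean * 5, (1 - p_mean) * 5)
succ   <- rbinom(n, trials, p_i)
d      <- data.frame(succ, fail = trials - succ, x1, x2)

# Binomial fits supply the log-likelihoods that QAIC rescales
full_b <- glm(cbind(succ, fail) ~ x1 + x2, family = binomial, data = d)
red_b  <- glm(cbind(succ, fail) ~ x1,      family = binomial, data = d)

# Dispersion estimate: Pearson chi-square / residual df of the full model
c_hat <- sum(residuals(full_b, type = "pearson")^2) / df.residual(full_b)

# Hypothetical helper: QAIC = -2 * logLik / c_hat + 2 * K,
# where K counts the coefficients plus one for c_hat itself
qaic <- function(fit, c_hat) {
  k <- length(coef(fit)) + 1
  -2 * as.numeric(logLik(fit)) / c_hat + 2 * k
}
qaic_values <- c(reduced = qaic(red_b, c_hat), full = qaic(full_b, c_hat))
print(qaic_values)

# Quasi-binomial refits: same coefficients, but AIC is NA
full_q <- update(full_b, family = quasibinomial)
red_q  <- update(red_b,  family = quasibinomial)
print(AIC(full_q))

# For nested models, an F test accounts for the estimated dispersion
f_test <- anova(red_q, full_q, test = "F")
print(f_test)
```

The same $\hat{c}$ is used for every candidate model so that the QAIC values are comparable. It matches the dispersion that `summary(full_q)` reports.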


For more information on beta vs logistic regression see: https://stats.stackexchange.com/a/29042/97671

For more discussion on when to use a quasi-binomial vs a binomial model see: "What is quasi-binomial distribution (in the context of GLM)?"

Reference:
Wang, Y., Murphy, O., Turgeon, M., Wang, Z., Bhatnagar, S. R., Schulz, J., and Moodie, E. E. M. (2015). The perils of quasi-likelihood information criteria. Stat, 4: 246–254. doi:10.1002/sta4.95