Comparing Quasibinomial GLMs in R – A Detailed Guide

Tags: binomial distribution, generalized linear model, quasi-likelihood, r

I have some proportion data that I am trying to fit a GLM to. Because the data are proportions, someone suggested that quasibinomial is the correct GLM family to use. I am now trying to do model selection. However, R does not report AIC scores for quasi models, so I'm at a bit of a loss as to what to do. I've read somewhere that likelihoods aren't appropriate for quasi models; are there any other options?

Best Answer

If your response variable is a proportion $a/b$ formed from two continuous, strictly positive quantities ($a, b \in \mathbb{R}$, $a, b > 0$) and it lies strictly between 0 and 1, then it is reasonable to model it as following a Beta distribution. In that case you should perform a beta regression, implemented in the betareg package in R. You can then compare your models with stats::AIC.
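
As a minimal sketch of that workflow, the following fits two beta regressions to simulated data and compares them with stats::AIC. The data, the variable names (y, x1, x2) and the candidate models are all assumptions made for illustration; the block only runs if the betareg package is installed.

```r
# Hypothetical data: a continuous proportion y in (0, 1) and two predictors
if (requireNamespace("betareg", quietly = TRUE)) {
  set.seed(7)
  n  <- 100
  x1 <- rnorm(n)
  x2 <- rnorm(n)                      # x2 has no true effect in this simulation
  mu <- plogis(0.2 + 0.7 * x1)        # mean of the Beta response
  precision <- 20                     # Beta precision parameter
  y  <- rbeta(n, mu * precision, (1 - mu) * precision)
  dat <- data.frame(y, x1, x2)

  # Two candidate beta regressions
  m1 <- betareg::betareg(y ~ x1, data = dat)
  m2 <- betareg::betareg(y ~ x1 + x2, data = dat)

  # betareg models have a log-likelihood, so AIC() works directly
  aic_table <- AIC(m1, m2)
  print(aic_table)
}
```

The model with the lower AIC is preferred, with the usual caveat that small differences are not decisive.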

If $a$ represents a count of successes and $b$ the number of trials, you should instead use logistic regression, which models the response with the binomial distribution. The quasi-binomial family is useful if your data exhibit more (or less) variance than a binomial model predicts. If you choose a quasi-binomial model, the AICcmodavg package can provide the quasi-likelihood counterpart of AIC. However, you should proceed with caution, as this metric may be too sensitive to departures from the model (Wang et al. 2015).
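
To make the quasi-likelihood counterpart of AIC concrete, here is a rough base-R sketch that computes it by hand, without AICcmodavg. It assumes the commonly used definition QAIC = -2 logLik / c-hat + 2K, with c-hat estimated once from the most complex candidate model and K counting one extra parameter for c-hat. The simulated data and the helper name qaic are hypothetical.

```r
# Simulated overdispersed binomial data (hypothetical, for illustration only)
set.seed(1)
n   <- 60
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), trials = 20)
p   <- plogis(-0.5 + 0.8 * dat$x1 + rnorm(n, sd = 0.7))  # extra noise -> overdispersion
dat$succ <- rbinom(n, dat$trials, p)
dat$fail <- dat$trials - dat$succ

# A quasibinomial fit has no likelihood, so AIC() returns NA
m_quasi <- glm(cbind(succ, fail) ~ x1 + x2, family = quasibinomial, data = dat)
aic_quasi <- AIC(m_quasi)

# Fit the candidates with the binomial family so that logLik() is defined
m_global <- glm(cbind(succ, fail) ~ x1 + x2, family = binomial, data = dat)
m_small  <- glm(cbind(succ, fail) ~ x1,      family = binomial, data = dat)

# c-hat from the most complex model: Pearson chi-square / residual df
c_hat <- sum(residuals(m_global, type = "pearson")^2) / df.residual(m_global)

# Hypothetical helper: QAIC with one extra parameter for estimating c-hat
qaic <- function(model, c_hat) {
  k <- length(coef(model)) + 1
  as.numeric(-2 * logLik(model) / c_hat + 2 * k)
}

qaic_values <- c(global = qaic(m_global, c_hat), small = qaic(m_small, c_hat))
print(qaic_values)
```

The same c-hat is deliberately used for every candidate; re-estimating it per model would make the QAIC values incomparable.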

For more information on beta vs logistic regression see: https://stats.stackexchange.com/a/29042/97671

For more discussion on when to use a quasi-binomial vs a binomial model see:
What is quasi-binomial distribution (in the context of GLM)?

Reference:

Wang, Y., Murphy, O., Turgeon, M., Wang, Z., Bhatnagar, S. R., Schulz, J., and Moodie, E. E. M. (2015). The perils of quasi-likelihood information criteria. Stat, 4: 246–254. doi:10.1002/sta4.95

Related Solutions

Solved – Perfect separation error message for glm with binomial but not with quasibinomial family

That sounds strange; I would guess it is a numerical coincidence. The only difference in R's glm between the binomial and quasibinomial families is in the calculation of standard errors; the estimation process is exactly the same. Alternatively, the difference in how standard errors are calculated might cause differences in the criteria for declaring convergence. In any case, you should not trust the model for hypothesis testing: the standard errors (in both the binomial and quasibinomial cases) are bogus. See my answer to "Why does logistic regression become unstable when classes are well-separated?" for some ideas of what to do in this case of separation.
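
The claim that the two families share the same estimates and differ only in their standard errors can be checked with a short sketch. The data below are simulated and hypothetical (and not separated); the point is only that the coefficients agree and the quasibinomial standard errors are the binomial ones scaled by the square root of the estimated dispersion.

```r
# Hypothetical grouped binomial data with mild overdispersion
set.seed(42)
n      <- 50
x      <- rnorm(n)
trials <- rep(15, n)
succ   <- rbinom(n, trials, plogis(0.3 + 0.6 * x + rnorm(n, sd = 0.5)))

fit_bin   <- glm(cbind(succ, trials - succ) ~ x, family = binomial)
fit_quasi <- glm(cbind(succ, trials - succ) ~ x, family = quasibinomial)

# Point estimates are identical
coef_equal <- isTRUE(all.equal(coef(fit_bin), coef(fit_quasi)))

# Standard errors differ by a factor of sqrt(dispersion)
phi      <- summary(fit_quasi)$dispersion
se_bin   <- summary(fit_bin)$coefficients[, "Std. Error"]
se_quasi <- summary(fit_quasi)$coefficients[, "Std. Error"]
se_ratio <- se_quasi / se_bin

print(coef_equal)
print(rbind(se_ratio, sqrt_phi = sqrt(phi)))
```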

Solved – Accounting for overdispersion in binomial glm using proportions, without quasibinomial

Overdispersion occurs for a number of reasons, but with presence/absence data it is often caused by clustering of observations and correlations between observations.

Taken from Brostrom & Holmberg (2011) Generalised Linear Models with Clustered Data: Fixed and random effects models with glmmML

"Generally speaking, a random effects model is appropriate if the observed clusters may be regarded as a random sample from a (large, possibly infinite) pool of possible clusters. The observed clusters are of no practical interest per se, but the distribution in the pool is. Or this distribution is regarded as a nuisance that needs to be controlled for."

https://cran.r-project.org/web/packages/eha/vignettes/glmmML.pdf

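library(lme4)           # provides glmer()
library(RVAideMemoire)  # provides overdisp.glmer()
# Observation-level random effect: one factor level per row of Data
Data$obs <- factor(formatC(seq_len(nrow(Data)), flag = "0", width = 3))
# Binomial GLMM; the per-observation random intercept absorbs extra-binomial variation
model.glmm <- glmer(cbind(number_pres, number_abs) ~ Var1 + Var2 + Var3 + Var4 +  # ... (further predictors elided in the original)
                      (1 | obs),
                    family = binomial(link = logit), data = Data)
overdisp.glmer(model.glmm)  # overdispersion check for the GLMM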
