Solved – Type I and Type II negative binomial distribution in zero-inflated negative binomial (ZINB) model

aic, negative-binomial-distribution, zero inflation

When is it appropriate to use a Type I versus a Type II negative binomial distribution in a zero-inflated negative binomial model?

I've found a similar question, but I could not follow its answer or tell whether it applies to zero-inflated negative binomial distributions.

Working in R, I have determined from AIC values that the zero-inflated negative binomial (ZINB) distribution fits my dataset better than the other distributions I tried.

Using the R package glmmTMB, I have specified ZINB models with both Type I and Type II:

library(glmmTMB)  # zero-inflated GLMMs
library(MuMIn)    # AICc()
m1 <- glmmTMB(dv ~ iv1 + iv2 + iv2, ziformula = ~., data = df, family = nbinom1)
m2 <- glmmTMB(dv ~ iv1 + iv2 + iv2, ziformula = ~., data = df, family = nbinom2)

Comparing the models' AICc values, the Type II model fits slightly better:

AICc(m1,m2)
   df     AICc
m1  9 528.3359
m2  9 527.7481

When using the zeroinfl function from the pscl package:

library(pscl)
zm2 <- zeroinfl(dv ~ iv1 + iv2 + iv2, data = df, dist = "negbin")
AICc(m2, zm2)
    df     AICc
m2   9 527.7481
zm2  9 527.7481

This yields the same AICc value as the Type II NB model (and the estimates and p-values are nearly identical), so it seems that zeroinfl assumes Type II.

My data set models the use of drugs (as the dv) over the past 30 days.

Is it appropriate to use a Type II negative binomial distribution, and why?
Is it reasonable to justify this decision with AIC values?

Best Answer

The difference between these two model families is the relationship between mean and variance.

nbinom1 (the linear parameterization, with the same mean-variance relationship as quasi-Poisson): variance = µ * phi

where µ is the mean and phi is the over-dispersion parameter (glmmTMB writes this as µ(1 + phi), so its phi is shifted by one)

nbinom2 (the default negative binomial in most packages): variance = µ(1 + µ/k), also written µ + (µ^2)/k

where µ is the mean and k is the over-dispersion parameter
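
To make the difference concrete, here is a minimal sketch that evaluates both variance functions as written above over a range of means. The values phi = 3 and k = 2 are arbitrary illustrative choices, not estimates from any data, and the function names are hypothetical.

```r
# Variance functions for the two parameterizations, as written in the text.
# phi = 3 and k = 2 are arbitrary illustrative values (assumption).
var_nb1 <- function(mu, phi) mu * phi        # linear in the mean
var_nb2 <- function(mu, k)   mu + mu^2 / k   # quadratic in the mean

mu <- c(1, 2, 4, 8, 16)
vars <- data.frame(
  mu  = mu,
  nb1 = var_nb1(mu, phi = 3),
  nb2 = var_nb2(mu, k = 2)
)
print(vars)

# Variance-to-mean ratio: constant under nbinom1, growing with mu under nbinom2
print(vars$nb1 / vars$mu)
print(vars$nb2 / vars$mu)
```

The practical consequence is that nbinom2 allows much more spread at high means than at low means, while nbinom1 inflates the variance by the same factor everywhere.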

When choosing between these, the paper by Ver Hoef & Boveng (2007) is very helpful, as are pages 16 and 17 of Bolker et al. (2012).

Ver Hoef & Boveng say that AIC doesn't necessarily apply to quasi-Poisson models (nbinom1), and they are skeptical about comparing AIC with QAIC (an information criterion developed for quasi-likelihood models), although you do see it done.
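
If the two glmmTMB fits are nonetheless compared by AICc (glmmTMB fits both nbinom1 and nbinom2 by maximum likelihood, unlike a classical quasi-Poisson fit), it helps to look at how large the gap actually is. The sketch below uses only the two AICc values reported in the question and converts them to Akaike weights, a common way to express relative support. The variable names are illustrative.

```r
# AICc values copied from the question's output
aicc <- c(m1_nbinom1 = 528.3359, m2_nbinom2 = 527.7481)

delta <- aicc - min(aicc)   # difference from the best-scoring model
w <- exp(-0.5 * delta)
w <- w / sum(w)             # Akaike weights, summing to 1

print(round(delta, 4))
print(round(w, 3))
```

A gap this small is usually read as the two models being close to indistinguishable on AICc alone, which is one more reason to look at the mean-variance relationship directly.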

Instead they recommend plotting the squared residuals against the fitted means. This plot can be very noisy, so they suggest grouping observations with similar fitted means and plotting the group averages instead. A linear trend suggests quasi-Poisson (nbinom1), whereas a quadratic trend argues for a negative binomial model (nbinom2).
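
A minimal base-R sketch of this kind of binned diagnostic follows. It is not the code from either reference. The data are simulated (an assumption, standing in for a real dataset), a plain Poisson GLM is used only to obtain fitted means without extra packages, and observations are binned into 20 groups by fitted mean.

```r
set.seed(1)

# Simulated stand-in data (assumption): one predictor, NB2-distributed counts
n  <- 2000
x  <- runif(n, 0, 3)
mu <- exp(0.5 + 0.8 * x)
y  <- rnbinom(n, mu = mu, size = 2)   # base R's rnbinom(mu, size) is the nbinom2 form

# A Poisson GLM is used here only to get fitted means in base R
fit    <- glm(y ~ x, family = poisson)
mu_hat <- fitted(fit)
sq_res <- (y - mu_hat)^2

# Bin observations by fitted mean and average within each bin
breaks <- quantile(mu_hat, probs = seq(0, 1, length.out = 21))
bins   <- cut(mu_hat, breaks = breaks, include.lowest = TRUE)
g_mu   <- as.numeric(tapply(mu_hat, bins, mean))
g_var  <- as.numeric(tapply(sq_res, bins, mean))

plot(g_mu, g_var, xlab = "Mean fitted value (per bin)",
     ylab = "Mean squared residual (per bin)")

# Overlay a linear (nbinom1-like) and a quadratic (nbinom2-like) trend.
# Both go through the origin; the quadratic fit leaves the linear
# coefficient free rather than fixing it at 1.
lin  <- lm(g_var ~ 0 + g_mu)
quad <- lm(g_var ~ 0 + g_mu + I(g_mu^2))
lines(g_mu, fitted(lin),  lty = 2)
lines(g_mu, fitted(quad), lty = 1)
legend("topleft", legend = c("linear", "quadratic"), lty = c(2, 1))
```

With real data, `y` and `mu_hat` would come from the fitted model of interest. Which curve tracks the binned points better is then judged by eye, as the paper does.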

If you have a decent number of samples and a finite number of combinations of explanatory variables, you could instead form the groups from treatment combinations rather than by binning similar values. This plot is demonstrated in Bolker et al. (2012) (link in the references), along with R code to generate it.
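
As a rough illustration of grouping by treatment combination (this is not the code from Bolker et al.), the sketch below computes one mean and one variance per cell of a hypothetical two-factor design. The factor names `trt` and `site`, the number of replicates, and the simulated counts are all assumptions.

```r
set.seed(2)

# Hypothetical design (assumption): 3 x 4 treatment combinations,
# 40 replicates each, counts simulated from an NB2 distribution
d <- expand.grid(trt = c("a", "b", "c"),
                 site = c("s1", "s2", "s3", "s4"),
                 rep = 1:40)
cell_mu <- exp(0.3 + 0.6 * as.integer(factor(d$trt)) +
                 0.3 * as.integer(factor(d$site)))
d$y <- rnbinom(nrow(d), mu = cell_mu, size = 1.5)

# One mean and one variance per treatment combination
cell_mean <- aggregate(y ~ trt + site, data = d, FUN = mean)
cell_var  <- aggregate(y ~ trt + site, data = d, FUN = var)
mv <- data.frame(mean = cell_mean$y, var = cell_var$y)

plot(mv$mean, mv$var, xlab = "Cell mean", ylab = "Cell variance")
abline(0, 1, lty = 3)   # Poisson reference line: variance = mean
```

Points that fan upward away from the reference line in a curved pattern point toward nbinom2, while a roughly straight, steeper line points toward nbinom1.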

Bolker, B., Brooks, M., Gardner, B., Lennert, C. & Minami, M., 2012. Owls example: a zero-inflated, generalized linear mixed model for count data. October 23, 2012. https://groups.nceas.ucsb.edu/non-linear-modeling/projects/owls/WRITEUP/owls.pdf

Ver Hoef, J.M. & Boveng, P.L., 2007. Quasi-Poisson vs. negative binomial regression: how should we model overdispersed count data? Ecology, 88(11), pp. 2766–2772.
