I'm trying to describe in words why I used a zero-inflated negative binomial regression instead of a negative binomial regression:
To model my data I used a negative binomial regression. However, as my response variable included a high proportion of zeros (more than would be expected under the negative binomial distribution), a negative binomial regression did not fit my data well. More specifically, as the negative binomial regression was attempting to account for the high number of zeros and the counts simultaneously, the predicted values were overly biased towards the zeros and the residual variation was high. In an attempt to correct these issues, I used a zero-inflated negative binomial regression. The zero-inflated negative binomial regression specified a model for the zeros and a model for the counts. This model reduced the residual variation because the zeros were modelled separately from the counts and therefore the predicted values for the counts were not weighted too heavily in favour of the zeros.
Could people comment on/edit/correct my justification?
Best Answer
I think you're on the right track. Zero-inflated models allow you to accommodate both values that happen to be zero (but could plausibly take other values) and certain zeros that are fixed at zero. You may want to provide specific examples of how both situations apply to your data. For example,
Adding these specifics helps justify your choice of model beyond "well, it kinda fits better." You want to convince people that your data would be well-fit by a negative binomial model if you could somehow magically remove the certain zeros.
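If it helps to make the "remove the certain zeros" idea concrete, here is a minimal simulation sketch using only the Python standard library. Everything in it is an assumption for illustration: the parameter values (`pi`, `mu`, `size`) are made up, and the helper names are hypothetical, not taken from your analysis or from any package. It draws data from a zero-inflated negative binomial, then compares the share of zeros in the full sample with the share among the observations that were not certain zeros.

```python
import math
import random


def nb_zero_prob(mu, size):
    """P(Y = 0) for a negative binomial with mean mu and dispersion parameter size."""
    return (size / (size + mu)) ** size


def zinb_zero_prob(pi, mu, size):
    """P(Y = 0) for a zero-inflated NB: certain zeros (probability pi) plus NB zeros."""
    return pi + (1 - pi) * nb_zero_prob(mu, size)


def draw_poisson(lam, rng):
    """Draw one Poisson(lam) value with Knuth's method (fine for small means)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def draw_zinb(pi, mu, size, rng):
    """Return (count, is_certain_zero) for one zero-inflated NB observation."""
    if rng.random() < pi:
        return 0, True  # certain zero: this unit could never have a positive count
    # A gamma-distributed Poisson rate gives a negative binomial count.
    lam = rng.gammavariate(size, mu / size)
    return draw_poisson(lam, rng), False


rng = random.Random(1)
pi, mu, size = 0.3, 3.0, 1.5  # hypothetical values chosen only for illustration

draws = [draw_zinb(pi, mu, size, rng) for _ in range(20000)]
observed_zero_share = sum(y == 0 for y, _ in draws) / len(draws)

# "Magically" drop the certain zeros; in real data this label is unobserved.
at_risk = [y for y, certain in draws if not certain]
at_risk_zero_share = sum(y == 0 for y in at_risk) / len(at_risk)

print("zero share, full sample:        ", round(observed_zero_share, 3))
print("zero share expected under ZINB: ", round(zinb_zero_prob(pi, mu, size), 3))
print("zero share, certain zeros removed:", round(at_risk_zero_share, 3))
print("zero share expected under NB:   ", round(nb_zero_prob(mu, size), 3))
```

In the full sample the zero share should sit well above what a negative binomial with the same count process implies, while the subset without certain zeros should roughly match it. With real data you cannot see which zeros are certain, which is why the subject-matter argument for their existence matters.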