Solved – normal vs negative binomial regression results

generalized linear model, negative-binomial-distribution, normal distribution

The pictures show the same response-covariate association. The response is a large count variable (number of sperm). When I fit a normal linear regression, the beta coefficient is significantly different from 0, whereas when I fit a negative binomial model the association is no longer significant.
It seems that there is an influential observation, and even though the fits are similar, the confidence intervals are much wider in the negative binomial model.

How can you justify this difference?


Best Answer

Ordinary Least Squares regression ("normal" linear regression) makes certain assumptions about the data. Here, the most salient assumptions are:

  • The observed values can be any real number
  • The errors are normally distributed, i.e. the observed values are normally distributed around their expected value

A generalized linear model with a negative binomial distribution makes different assumptions about your data. The negative binomial distribution is a discrete distribution, which makes it useful for modeling count data, particularly overdispersed counts whose variance exceeds the mean.

The errors do not actually need to be normally distributed for you to fit an OLS model, but making inferences from the parameters (p-values and confidence intervals) does rely on that assumption.

When the mean of the associated negative binomial distribution is large and the overdispersion is modest, it approximates a normal distribution. This is the best case: in those instances, the parameter estimates and significance for both models would likely agree.
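How close the negative binomial gets to a normal shape can be checked from its skewness, which is zero for a normal distribution. The following is a minimal sketch, assuming the mean/size parameterisation in which the variance is mu + mu^2/k; the function name `nb_skewness` and the parameter values are illustrative and do not come from the question.

```python
import math


def nb_skewness(mu: float, k: float) -> float:
    """Skewness of a negative binomial with mean mu and size k.

    Assumed parameterisation: variance = mu + mu**2 / k.
    """
    return (k + 2.0 * mu) / math.sqrt(k * mu * (mu + k))


# Tabulate the skewness for a few (hypothetical) sizes and means.
for k in (1.0, 10.0, 1000.0):
    for mu in (1e2, 1e4, 1e6):
        print(f"k={k:>7g}  mu={mu:>9g}  skewness={nb_skewness(mu, k):.3f}")
```

For a fixed size k, the skewness tends to 2/sqrt(k) as the mean grows, not to zero. A large mean therefore only gives a near-normal shape when k is also large, that is, when the counts are not strongly overdispersed.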

The fact that your models disagree most likely points to the OLS model being inappropriate for the circumstances. You find that the beta coefficient is significantly different from zero because you are assuming that the distribution is normal when, in fact, it is not. You could try plotting a histogram of the model residuals to convince yourself.
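The residual check can be sketched as follows. This is only an illustration on simulated data, not on the sperm counts from the question: the overdispersed counts are drawn from a gamma-Poisson mixture with made-up parameters, the straight line is fitted by the closed-form least-squares formulas, and the helper names (`poisson`, `nb_draw`) are hypothetical.

```python
import math
import random

random.seed(0)


def poisson(lam: float) -> int:
    # Knuth's multiplication method; adequate for the modest rates used here.
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1


def nb_draw(mu: float, k: float) -> int:
    # Gamma-Poisson mixture: mean mu, variance mu + mu**2 / k.
    return poisson(random.gammavariate(k, mu / k))


# Simulated covariate and overdispersed count response (assumed values).
n = 300
x = [random.uniform(0.0, 10.0) for _ in range(n)]
y = [nb_draw(math.exp(2.0 + 0.15 * xi), 2.0) for xi in x]

# Ordinary least squares for a single covariate, in closed form.
x_bar, y_bar = sum(x) / n, sum(y) / n
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = y_bar - slope * x_bar
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Sample skewness of the residuals (close to 0 if they were normal).
m2 = sum(r ** 2 for r in residuals) / n
m3 = sum(r ** 3 for r in residuals) / n
skew = m3 / m2 ** 1.5
print(f"residual skewness: {skew:.2f}")

# Crude text histogram of the residuals.
lo, hi = min(residuals), max(residuals)
n_bins = 12
width = (hi - lo) / n_bins
counts = [0] * n_bins
for r in residuals:
    counts[min(int((r - lo) / width), n_bins - 1)] += 1
for i, c in enumerate(counts):
    print(f"{lo + i * width:8.1f} | {'#' * c}")
```

A long right tail in the histogram, together with a clearly positive skewness, suggests that the normal-error assumption behind the OLS p-value does not hold for data of this kind.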