Negative Binomial Distribution – How to Develop a Negative Binomial Model with Variable Overdispersion Parameter

generalized linear modelnegative-binomial-distributionoverdispersion

I am trying to develop a negative binomial model where the dependent variable is crash count, and the independent variables are traffic count and roadway length. Currently, with the below code, I get only one value of overdispersion parameter. But I want to get the overdispersion parameter that varies as a function of roadway length. Can you please tell me what should I change in the code? Thank you very much!

Crash_count<- 
    glm.nb(Total_crashes~LN(traffic_count) + 
    road_length, data=mydata)

Best Answer

I don't think you will be able to do that with glm.nb() but it is possible to do this is a Bayesian framework. I think there are multiple options on how to do this, but the most straightforward for someone without experience in Bayesian modelling will be to use the brms package, which allows the user to fit models using the same model syntax as R packages like lme4.

Specifically, the model formula in the brms package would be the following, assuming you want to model Total_crashes by both traffic_count and road_length and just the overdispersion parameter by road_length:

Crash_count <- brm(bf(Total_crashes~LN(traffic_count) + road_length,
                      shape ~ road_length),
                   data=mydata,
                   family = negbinomial())

See this post for an example of someone predicting the shape/phi parameter, which is the overdispersion, using variables in a model:

https://discourse.mc-stan.org/t/negative-binomial-shape-and-phi-parameters/8120/3

That post, in combination with the vignette I link below about fitting distributional models in brms, should give you a good start on fitting the model you'd like to fit

https://cran.r-project.org/web/packages/brms/vignettes/brms_distreg.html

EDIT: As others have pointed out in the comments, there are other packages that allow for this to be modelled.