I am trying to develop a negative binomial model where the dependent variable is crash count, and the independent variables are traffic count and roadway length. Currently, with the below code, I get only one value of overdispersion parameter. But I want to get the overdispersion parameter that varies as a function of roadway length. Can you please tell me what should I change in the code? Thank you very much!
Crash_count<-
glm.nb(Total_crashes~LN(traffic_count) +
road_length, data=mydata)
Best Answer
I don't think you will be able to do that with
glm.nb()
but it is possible to do this is a Bayesian framework. I think there are multiple options on how to do this, but the most straightforward for someone without experience in Bayesian modelling will be to use thebrms
package, which allows the user to fit models using the same model syntax as R packages like lme4.Specifically, the model formula in the
brms
package would be the following, assuming you want to modelTotal_crashes
by bothtraffic_count
androad_length
and just the overdispersion parameter byroad_length
:See this post for an example of someone predicting the shape/phi parameter, which is the overdispersion, using variables in a model:
https://discourse.mc-stan.org/t/negative-binomial-shape-and-phi-parameters/8120/3
That post, in combination with the vignette I link below about fitting distributional models in brms, should give you a good start on fitting the model you'd like to fit
https://cran.r-project.org/web/packages/brms/vignettes/brms_distreg.html
EDIT: As others have pointed out in the comments, there are other packages that allow for this to be modelled.