I am running a negative binomial regression using statsmodels in Python.
My DV is count data and zero-inflated. The one IV in my model is categorical and I have no constant term; my understanding is that the intercept of the regression is the coefficient for my reference group.
What I am seeing in the results is that the coefficient for the intercept is negative and significant, which doesn't make sense to me, as the DV is never negative. As a precaution I dropped the extreme values in my data (anything above the 99th percentile). Below is the description of my DV:
count    1.982101e+06
mean     1.369949e+00
std      4.218949e+00
min      0.000000e+00
25%      0.000000e+00
50%      0.000000e+00
75%      0.000000e+00
max      3.200000e+01
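The Python code I am running is:
from statsmodels.formula.api import negativebinomial
# Treatment coding with level 0 of IV as the reference group
form = "DV ~ C(IV, Treatment(reference=0))"
est = negativebinomial(form, data=df).fit()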
I should add that the problem occurs with only one level of the IV. If I run the regression without that level, I don't get a negative prediction for the DV.
I hope I have provided enough information to explain my question and I appreciate any help.
Best Answer
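The coefficients of a negative binomial regression are on the log scale, because the model uses a log link. The intercept is therefore the log of the expected count (the log-rate) for observations at the reference level of every categorical factor and at a value of 0 for every continuous covariate. A negative intercept does not mean a negative predicted count; it only means that the expected count for the reference group is below 1.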
For example, an intercept of -1 corresponds to a rate of about 0.37 ($\exp(-1) \approx 0.37$).
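To make the back-transformation concrete, here is a minimal sketch in plain Python. The helper name `rates_from_log_coefs` and the coefficient values are hypothetical, chosen only for illustration; they are not taken from the fitted model in the question. It assumes treatment coding, so each non-reference coefficient is a log rate ratio relative to the reference level.

```python
import math

def rates_from_log_coefs(intercept, level_coefs):
    """Convert log-scale coefficients into expected counts per level.

    intercept: log-rate of the reference level.
    level_coefs: dict mapping each non-reference level to its coefficient,
        i.e. its log rate ratio relative to the reference level.
    """
    rates = {"reference": math.exp(intercept)}
    for level, coef in level_coefs.items():
        # Coefficients add on the log scale, so rates multiply.
        rates[level] = math.exp(intercept + coef)
    return rates

# Hypothetical values, chosen only for illustration.
rates = rates_from_log_coefs(-1.0, {"level_1": 0.5, "level_2": 2.0})
for level, rate in rates.items():
    print(f"{level}: {rate:.3f}")
```

Every rate produced this way is strictly positive, whatever the sign of the coefficients, which is why a negative intercept is compatible with a non-negative DV.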