Poisson Regression and Negative Binomial regression results interpretation

negative-binomial-distribution, poisson distribution, r, regression, regression coefficients

I'm using Poisson regression and negative binomial regression to estimate temporal trends. My understanding is that the coefficients are on the log scale and have to be converted to data units (counts per unit of time [year, month…]) by multiplying them by 100. Is that correct?

Negative binomial example

> library(MASS)
> Y <- DF$Counts
> X <- DF$Years
> Nig <- glm.nb(Y ~ X)
> summary(Nig)

Call:
glm.nb(formula = Y ~ X, init.theta = 6.190108641, link = log)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-2.19350  -0.81948  -0.06559   0.47013   1.85608  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept) -18.316582  19.892078  -0.921    0.357
X             0.010564   0.009947   1.062    0.288

(Dispersion parameter for Negative Binomial(6.1901) family taken to be 1)

    Null deviance: 32.207  on 29  degrees of freedom
Residual deviance: 31.059  on 28  degrees of freedom
AIC: 210.93

Number of Fisher Scoring iterations: 1


              Theta:  6.19 
          Std. Err.:  2.23 

 2 x log-likelihood:  -204.928 

The slope (trend) is 0.01056 on the log scale. To convert it to counts per year, I believe it has to be multiplied by 100, which would give a trend of 1.056 counts per year.

Best Answer

No, the coefficients are on the log scale, so you need to exponentiate them rather than multiply them by 100. Here exp(0.010564) ≈ 1.0106, i.e. the expected count changes by a factor of about 1.01 per unit of X (per year, in your case). To express that as a percentage increase (or decrease), subtract 1 and multiply by 100: (exp(0.010564)-1)*100 ≈ 1.06, so in the example that is an increase of about 1% per year. Note that this is a relative change, not an absolute number of counts per year. There is also considerable uncertainty around the estimated slope (standard error 0.009947, p = 0.288), so you should look at its confidence interval as well.
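As a rough illustration of the conversion, the sketch below plugs in the slope and standard error copied from the summary output in the question. The variable names are made up for this example, and the interval is an approximate Wald interval built on the log scale; it is an assumption of this sketch, not something the original output reports.

```r
# Values copied from the summary() output in the question
beta <- 0.010564   # slope estimate on the log scale
se   <- 0.009947   # standard error of the slope

rate_ratio <- exp(beta)               # multiplicative change in the mean per year
pct_change <- (rate_ratio - 1) * 100  # the same change expressed in percent

# Approximate 95% Wald interval: build it on the log scale, then exponentiate
z      <- qnorm(0.975)
ci_log <- beta + c(-1, 1) * z * se
ci_pct <- (exp(ci_log) - 1) * 100

print(round(c(rate_ratio = rate_ratio, pct_change = pct_change), 4))
print(round(ci_pct, 2))
```

The interval runs from a small yearly decrease to an increase of a few percent, which agrees with the non-significant p-value in the output. With the fitted object at hand, exp(confint(Nig)) should give a profile-likelihood version of the same interval.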

Additionally, I would check that your model has actually converged and produced no warnings, and I would look into rescaling your X variable. On the log scale, an intercept of -18.316582 corresponds to a mean of exp(-18.316582), roughly 1e-8 counts, at X = 0, which is absurdly small, and there could be different reasons for that. Perhaps you put in calendar years (such as 1950, 2000 and so on)? If so, that intercept refers to the year 0, far outside your data, while if you gave X as years since the first year in your dataset, the numbers might look less weird. Alternatively, there may just be some convergence problem.
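To see what rescaling X does, here is a minimal sketch on simulated data. The years, counts and object names are hypothetical stand-ins, not the asker's DF. It uses a Poisson glm to stay within base R; the same recentering should carry over to glm.nb.

```r
set.seed(1)

# Simulated stand-in for the data: 30 calendar years, about a 1% increase per year
years  <- 1981:2010
counts <- rpois(length(years), lambda = 20 * exp(0.01 * (years - min(years))))

# Fit once with raw calendar years and once with years since the first year
fit_raw      <- glm(counts ~ years, family = poisson)
years_since  <- years - min(years)
fit_centered <- glm(counts ~ years_since, family = poisson)

print(coef(fit_raw))       # intercept: log mean extrapolated back to year 0
print(coef(fit_centered))  # intercept: log mean in the first observed year

# Convergence flags stored on the fitted objects
print(c(raw = fit_raw$converged, centered = fit_centered$converged))
```

The slope is the same in both fits; only the intercept changes, and exp() of the recentered intercept can be read directly as the expected count in the first year of the data.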