R – How to Interpret Rate Ratios and Odds Ratios from Zero-Altered/Hurdle Negative Binomial Model

I am having a really hard time interpreting the results from a zero-altered negative binomial model. This is related to my earlier question, but some things have changed. I am doing this in R.

Quick background on my data: I am trying to model bythotrephes prey found in fish stomachs. My data are zero-inflated and overdispersed. After doing model diagnostics/selection, I decided that a ZANB/hurdle model was best.

Based on AICc, the most parsimonious model is:

library(pscl)  # provides hurdle()
h5 <- hurdle(Byths ~ depth * Season,
             data = dietGLM, dist = "negbin")
summary(h5)

where Byths is the bythotrephes count, depth is the maximum depth at which the fish was collected (max. trawl depth), and Season is a categorical variable (Pre Hypoxia, Peak Hypoxia, and Post Hypoxia). I did not scale or transform anything for the model.

Call:
hurdle(formula = Byths ~ depth * Season, data = dietGLM, dist = "negbin")

Pearson residuals:
    Min      1Q  Median      3Q     Max 
-0.8326 -0.5676 -0.2256  0.1534  4.7221 

Count model coefficients (truncated negbin with log link):
                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)       5.88537    0.62991   9.343  < 2e-16 ***
depth            -0.01528    0.01699  -0.899 0.368510    
SeasonPeak       -3.03182    0.86296  -3.513 0.000443 ***
SeasonPost       -2.74276    1.95642  -1.402 0.160937    
depth:SeasonPeak  0.05199    0.02165   2.401 0.016339 *  
depth:SeasonPost  0.06410    0.03995   1.604 0.108609    
Log(theta)        0.13513    0.19772   0.683 0.494335    
Zero hurdle model coefficients (binomial with logit link):
                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)       4.02258    1.10901   3.627 0.000287 ***
depth            -0.09761    0.02534  -3.851 0.000117 ***
SeasonPeak       -2.55477    1.65554  -1.543 0.122790    
SeasonPost       -3.94742    3.35201  -1.178 0.238944    
depth:SeasonPeak  0.08342    0.03684   2.264 0.023569 *  
depth:SeasonPost  0.09163    0.06704   1.367 0.171698    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Theta: count = 1.1447
Number of iterations in BFGS optimization: 17 
Log-likelihood: -344.6 on 13 Df

Then I did:

exp(cbind(Exponentiated_Odds_Ratio=coef(h5), confint(h5)))

which gives:

                           Exponentiated_Odds_Ratio        2.5 %       97.5 %
count_(Intercept)                  359.73629541 1.046652e+02 1236.4199806
count_depth                          0.98483495 9.525747e-01    1.0181878
count_SeasonPeak                     0.04822789 8.886849e-03    0.2617271
count_SeasonPost                     0.06439212 1.391624e-03    2.9795012
count_depth:SeasonPeak               1.05336831 1.009601e+00    1.0990332
count_depth:SeasonPost               1.06619778 9.858986e-01    1.1530371
zero_(Intercept)                    55.84473421 6.353222e+00  490.8744754
zero_depth                           0.90700275 8.630488e-01    0.9531952
zero_SeasonPeak                      0.07770974 3.028865e-03    1.9937515
zero_SeasonPost                      0.01930438 2.706583e-05   13.7686161
zero_depth:SeasonPeak                1.08699475 1.011268e+00    1.1683924
zero_depth:SeasonPost                1.09595488 9.610133e-01    1.2498444

Putting everything together (for me it is easier to look at this, so I hope this is allowed…):

I've read sooo many papers and book chapters on hurdle models and OR and I am still not understanding how to clearly explain my results. I have a limited stats background so that doesn't help.

This is what I have so far but honestly it doesn't make sense to me….

For depth in the zero part of the model: OR = exp(-0.10) = 0.91. Does this mean that for a one-unit increase in depth, the odds of my fish of interest consuming bythotrephes increase by a factor of 0.91? What exactly does this mean?
For the count part of the model: RR = exp(-3.03) = 0.05. Does this mean fish are less likely to consume bythotrephes during 'SeasonPeak' compared to 'SeasonPre'?

I am not sure how to interpret the categorical variables, how to put in words the results between RR and OR, and how to interpret interaction terms.

I have looked at other questions: this, this, this, and many others on similar topics.

I am happy to include more information/data if it helps understand my question.
The simpler you can explain this the better. Try to explain this as if you were explaining it to a kid 🙂

Best Answer

The zero hurdle model coefficients are interpreted exactly as you would interpret coefficients from a logistic regression predicting non-zeros vs. zeros, where a non-zero count is the event and a zero count is the reference category. So, the odds ratio of 0.91 for depth means that for every one-unit increase in depth, the odds of consuming bythotrephes are multiplied by 0.91, i.e., the odds decrease by 9%. That is, the greater the depth, the less likely the fish are to consume bythotrephes.
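To make "the odds are multiplied by 0.91" concrete, here is a minimal sketch in base R. It hard-codes the zero hurdle intercept and depth coefficient from the summary output in the question, so it describes SeasonPre (the reference season) only. The depths used are hypothetical values chosen for illustration, not values taken from the data.

```r
# Zero hurdle coefficients copied from the summary output in the question
b0      <-  4.02258   # intercept: log-odds of a non-zero count in SeasonPre at depth 0
b_depth <- -0.09761   # change in log-odds per one-unit increase in depth (SeasonPre)

# Hypothetical depths, chosen only for illustration
depth <- c(10, 11, 40, 41)

log_odds <- b0 + b_depth * depth
odds     <- exp(log_odds)
prob     <- plogis(log_odds)   # inverse logit: odds / (1 + odds)

# The odds ratio for a one-unit step is the same wherever the step is taken
or_10_11 <- odds[2] / odds[1]
or_40_41 <- odds[4] / odds[3]

print(round(data.frame(depth, odds, prob), 3))
print(c(or_10_11 = or_10_11, or_40_41 = or_40_41, exp_b_depth = exp(b_depth)))
```

The odds shrink by the same factor for every one-unit step in depth, while the change in the probability of consuming bythotrephes depends on where along the depth range the step is taken. That is why the coefficient is reported as an odds ratio rather than as a change in probability.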

(Note that because of the interaction, this relationship applies only to SeasonPre, the reference season.)
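For the other seasons, the depth effect on the log-odds scale is the depth coefficient plus the relevant interaction coefficient. A minimal sketch, using the zero hurdle estimates copied from the summary output in the question (the object name `or_depth` is made up for this example):

```r
# Zero hurdle coefficients copied from the summary output in the question
b_depth      <- -0.09761   # depth slope in SeasonPre (reference level)
b_depth_peak <-  0.08342   # depth:SeasonPeak interaction
b_depth_post <-  0.09163   # depth:SeasonPost interaction

# Odds ratio per one-unit increase in depth, separately for each season
or_depth <- exp(c(
  Pre  = b_depth,
  Peak = b_depth + b_depth_peak,
  Post = b_depth + b_depth_post
))

print(round(or_depth, 3))
```

These are point estimates only. A confidence interval for the Peak or Post slope would also need the covariance between the depth and interaction estimates (available from `vcov()` on the fitted model), which is not shown in the question.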

For the count part, the exponentiated coefficients are incidence (i.e., count) ratios among those that consume bythotrephes. So, the incidence ratio of 0.05 for SeasonPeak means that among fish that consume any bythotrephes, fish consume 5% as many bythotrephes in SeasonPeak as they do in SeasonPre (at a depth of 0). The positive coefficient for the depth x SeasonPeak interaction means this ratio becomes less extreme as depth increases: it is multiplied by exp(0.052), about 1.05, for each additional unit of depth.
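Because of the interaction, the SeasonPeak vs. SeasonPre count ratio depends on depth. A minimal sketch of that calculation, using the count model estimates copied from the summary output in the question. The helper `peak_vs_pre()` and the depths are hypothetical and chosen only for illustration, since the observed depth range is not given in the question.

```r
# Count model coefficients copied from the summary output in the question
b_peak       <- -3.03182   # SeasonPeak vs. SeasonPre at depth 0 (log scale)
b_depth_peak <-  0.05199   # depth:SeasonPeak interaction (log scale)

# Hypothetical helper: SeasonPeak / SeasonPre count ratio at a given depth
peak_vs_pre <- function(depth) exp(b_peak + b_depth_peak * depth)

depth  <- c(0, 20, 40)     # hypothetical depths
ratios <- peak_vs_pre(depth)

# Depth at which the two seasons would have equal expected counts (ratio = 1)
break_even <- -b_peak / b_depth_peak

print(round(data.frame(depth, ratio = ratios), 3))
print(break_even)
```

At depth 0 the ratio is the 0.05 reported above, and it grows toward 1 as depth increases. Whether the break-even depth lies inside the sampled depth range would need to be checked against the data before interpreting it.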
