Solved – Calculate interaction effect confidence intervals in zero-inflated Poisson regression

confidence interval, interaction, r, zero inflation

I'm conducting a zero-inflated Poisson regression using the pscl package in R. I've included interaction terms but am having an issue with interpretation. I am assuming an additive effect and summing coefficients (x + y + xy) but am not sure what to do about the confidence intervals or p-values. From what I understand, these have to be re-estimated, probably with some bootstrapping method, but I can't figure out how to do this.

The main issue is that one of my effects reverses in interaction. I've provided a simplified version of the code below (sorry, I can't share the data). Here's a brief description of the scenario: when doctors discuss re-injury prevention with their patients, time off work increases, but in interaction with a low-stress setting, it reduces time off work.

So my question is:

  • How do you calculate interaction effect confidence intervals and p-values using a zeroinfl object?
  • Is the process for calculating CIs and p-values different between the zero and count models?

Example code would be greatly appreciated!

library(pscl)

# PrevDisc * LowStress already expands to both main effects plus their interaction
model <- zeroinfl(TimeLoss ~ PrevDisc * LowStress,
                  data = Doctors)

Best Answer

If PrevDisc and LowStress are binary variables (this is my impression from your description), then the interaction model simply corresponds to four different zero-inflated Poisson distributions: one for each combination of PrevDisc and LowStress.

When using the formula TimeLoss ~ PrevDisc * LowStress, treatment coding of the coefficients is used, i.e., the four parameters (in each model part) are coded as an intercept, two main effects, and an interaction effect. This coding facilitates judging whether or not the interaction effect is significant.
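Under treatment coding, the PrevDisc effect in the LowStress group is the sum of the PrevDisc main effect and the interaction coefficient, and its Wald standard error follows from the estimated covariance matrix. The sketch below shows one way to compute this by hand from coef() and vcov(). Since the original data are not available, the Doctors data frame is simulated with made-up parameter values, and the coefficient names assume both predictors are factors with levels "0" and "1".

```r
library(pscl)

# Simulated stand-in for the unavailable Doctors data (hypothetical values)
set.seed(1)
n  <- 2000
x  <- rbinom(n, 1, 0.5)                          # PrevDisc indicator
z  <- rbinom(n, 1, 0.5)                          # LowStress indicator
mu <- exp(1 + 0.3 * x + 0.2 * z - 0.6 * x * z)   # mean of the count component
p0 <- plogis(-1 + 0.5 * x)                       # zero-inflation probability
Doctors <- data.frame(
  TimeLoss  = ifelse(rbinom(n, 1, p0) == 1, 0L, rpois(n, mu)),
  PrevDisc  = factor(x),
  LowStress = factor(z)
)

model <- zeroinfl(TimeLoss ~ PrevDisc * LowStress, data = Doctors)

b <- coef(model)                  # names carry a "count_" or "zero_" prefix
V <- vcov(model)[names(b), names(b)]

# Weight vector selecting main effect + interaction in the count part
w <- setNames(numeric(length(b)), names(b))
w[c("count_PrevDisc1", "count_PrevDisc1:LowStress1")] <- 1

est   <- sum(w * b)                        # PrevDisc effect when LowStress = 1
se    <- sqrt(drop(t(w) %*% V %*% w))      # Wald standard error of the sum
zstat <- est / se
ci    <- est + c(-1, 1) * qnorm(0.975) * se
p     <- 2 * pnorm(-abs(zstat))

round(c(estimate = est, se = se, lower = ci[1], upper = ci[2], p = p), 4)
```

Replacing the "count_" prefix with "zero_" in the two selected names gives the corresponding quantity for the zero-inflation part, so the procedure is the same for both components.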

If you want to assess the PrevDisc effect separately for the two LowStress groups, then you can use a nested coding of the coefficients via the formula TimeLoss ~ LowStress/PrevDisc.
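As a rough illustration of the nested coding, the sketch below fits both parameterizations to simulated data (a hypothetical stand-in for Doctors, with made-up parameter values). It assumes LowStress is stored as a factor; with a numeric 0/1 variable the / operator would produce only a product term rather than one PrevDisc effect per group.

```r
library(pscl)

# Simulated stand-in for the unavailable Doctors data (hypothetical values)
set.seed(1)
n  <- 2000
x  <- rbinom(n, 1, 0.5)                          # PrevDisc indicator
z  <- rbinom(n, 1, 0.5)                          # LowStress indicator
mu <- exp(1 + 0.3 * x + 0.2 * z - 0.6 * x * z)   # mean of the count component
p0 <- plogis(-1 + 0.5 * x)                       # zero-inflation probability
Doctors <- data.frame(
  TimeLoss  = ifelse(rbinom(n, 1, p0) == 1, 0L, rpois(n, mu)),
  PrevDisc  = factor(x),
  LowStress = factor(z)
)

m_int    <- zeroinfl(TimeLoss ~ PrevDisc * LowStress, data = Doctors)
m_nested <- zeroinfl(TimeLoss ~ LowStress / PrevDisc, data = Doctors)

# Same model, different parameterization: the log-likelihoods should agree
ll <- c(interaction = as.numeric(logLik(m_int)),
        nested      = as.numeric(logLik(m_nested)))
print(ll)

# Wald intervals for the PrevDisc effect within each LowStress group,
# for both the count part and the zero part
ci <- confint(m_nested)
print(ci[grep("PrevDisc", rownames(ci)), ])
```

With this coding, the group-specific PrevDisc effects and their intervals can be read directly from the output, without summing coefficients.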

In either case, all inference can be done "in the usual way": summary() and confint() give marginal Wald tests and Wald confidence intervals, lrtest() (from lmtest) compares nested models with a likelihood ratio test, and AIC()/BIC() compare them by information criteria. (Generalized) linear hypotheses, such as a sum of coefficients, can be tested with linearHypothesis() (from car) and glht() (from multcomp), respectively.
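A minimal sketch of this workflow might look as follows. It uses simulated data (a hypothetical stand-in for Doctors, with made-up parameter values) and assumes the lmtest package is installed. Note that because the same formula is used for both model parts, the likelihood ratio test here assesses the interaction in the count part and the zero part jointly.

```r
library(pscl)
library(lmtest)

# Simulated stand-in for the unavailable Doctors data (hypothetical values)
set.seed(1)
n  <- 2000
x  <- rbinom(n, 1, 0.5)                          # PrevDisc indicator
z  <- rbinom(n, 1, 0.5)                          # LowStress indicator
mu <- exp(1 + 0.3 * x + 0.2 * z - 0.6 * x * z)   # mean of the count component
p0 <- plogis(-1 + 0.5 * x)                       # zero-inflation probability
Doctors <- data.frame(
  TimeLoss  = ifelse(rbinom(n, 1, p0) == 1, 0L, rpois(n, mu)),
  PrevDisc  = factor(x),
  LowStress = factor(z)
)

m_main <- zeroinfl(TimeLoss ~ PrevDisc + LowStress, data = Doctors)
m_int  <- zeroinfl(TimeLoss ~ PrevDisc * LowStress, data = Doctors)

summary(m_int)         # marginal Wald z tests for every coefficient
wald_ci <- confint(m_int)   # Wald confidence intervals (count_ and zero_ rows)
print(wald_ci)

# Likelihood ratio test: does adding the interaction (in both parts) help?
lr <- lrtest(m_main, m_int)
print(lr)

# Information criteria for the two candidate models
print(c(main = AIC(m_main), interaction = AIC(m_int)))
```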

The difference in interpretation between the two model parts is that the count model is a log-linear model for the mean of the count component, whereas the zero model (with the default logit link) is linear in the log-odds of zero inflation, i.e., of an observation coming from the point mass at zero.
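To make the two scales concrete, one option is to exponentiate the estimates and their Wald limits. The sketch below does this on simulated data (a hypothetical stand-in for Doctors, with made-up parameter values) and assumes the default logit link for the zero part: exponentiated count coefficients are then multiplicative changes in the mean of the count component, and exponentiated zero coefficients are odds ratios for zero inflation.

```r
library(pscl)

# Simulated stand-in for the unavailable Doctors data (hypothetical values)
set.seed(1)
n  <- 2000
x  <- rbinom(n, 1, 0.5)                          # PrevDisc indicator
z  <- rbinom(n, 1, 0.5)                          # LowStress indicator
mu <- exp(1 + 0.3 * x + 0.2 * z - 0.6 * x * z)   # mean of the count component
p0 <- plogis(-1 + 0.5 * x)                       # zero-inflation probability
Doctors <- data.frame(
  TimeLoss  = ifelse(rbinom(n, 1, p0) == 1, 0L, rpois(n, mu)),
  PrevDisc  = factor(x),
  LowStress = factor(z)
)

model <- zeroinfl(TimeLoss ~ PrevDisc * LowStress, data = Doctors)

# Estimates with Wald limits on the linear-predictor scale
est <- cbind(estimate = coef(model), confint(model))
count_rows <- grepl("^count_", rownames(est))

# Count part: ratios of means in the Poisson component
rate_ratios <- exp(est[count_rows, ])
print(round(rate_ratios, 3))

# Zero part: odds ratios for belonging to the point mass at zero
odds_ratios <- exp(est[!count_rows, ])
print(round(odds_ratios, 3))
```

The intercept rows are baseline values (a mean and an odds, respectively) rather than ratios, so they are usually read separately from the other rows.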