Solved – How to interpret coefficients in a Poisson regression with interaction terms

generalized linear model, poisson distribution, r

This question is a follow-up to: How to interpret coefficients in a Poisson regression?

If we follow the (almost) exact same routine, but add an interaction term between the variables treatment and improved (just for the sake of my question, which is about interpreting the output), we get:

treatment     <- factor(rep(c(1, 2), c(43, 41)), 
                        levels = c(1, 2),
                        labels = c("placebo", "treated"))
improved      <- factor(rep(c(1, 2, 3, 1, 2, 3), c(29, 7, 7, 13, 7, 21)),
                        levels = c(1, 2, 3),
                        labels = c("none", "some", "marked"))    
numberofdrugs <- rpois(84, 10) + 1    
healthvalue   <- rpois(84, 5)   
y             <- data.frame(healthvalue, numberofdrugs, treatment, improved)
test          <- glm(healthvalue ~ numberofdrugs + treatment + improved + treatment:improved,
                     data = y, family = poisson)
summary(test)

Note the treatment:improved term that I added to the glm formula.

Now, we get the following output:

Call:
glm(formula = healthvalue ~ numberofdrugs + treatment + improved + 
    treatment:improved, family = poisson, data = y)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-2.9261  -0.8733  -0.0296   0.5473   2.3358  

Coefficients:
                                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)                      1.553051   0.184229   8.430   <2e-16 ***
numberofdrugs                    0.004298   0.014242   0.302   0.7628    
treatmenttreated                 0.007399   0.149440   0.050   0.9605    
improvedsome                     0.358897   0.164891   2.177   0.0295 *  
improvedmarked                  -0.178360   0.203756  -0.875   0.3814    
treatmenttreated:improvedsome   -0.330336   0.265310  -1.245   0.2131    
treatmenttreated:improvedmarked  0.050617   0.260203   0.195   0.8458    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 97.805  on 83  degrees of freedom
Residual deviance: 89.276  on 77  degrees of freedom
AIC: 383.29

Number of Fisher Scoring iterations: 5

Ignoring the coefficients that appear to be insignificant, here is my question:

I understand that, as in the original post, treatment=placebo and improved=none are the base levels for those variables, and are thus set to zero. My question is: why are there no interaction terms involving the base levels treatment=placebo and improved=none?

I thought setting the base levels to zero was just a construct, and in my mind there should still be some correlation between them…(?)

Best Answer

You write

I understand that, as in the original post, treatment=placebo and improved=none are the base levels for those variables, and are thus set to zero. My question is: why are there no interaction terms involving the base levels treatment=placebo and improved=none?

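Because the base levels are coded as 0, and 0 multiplied by anything is still 0. That is how dummy coding works: an interaction column is the product of the corresponding dummy columns, so any interaction involving a base level is identically 0 and there is nothing left to estimate.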
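Written out with indicator variables, the model in the question can be sketched as follows (the symbols $T$, $S$, $M$ and the numbering of the $\beta$s are notation introduced here for illustration; they are not part of the R output):

$$\log \mu = \beta_0 + \beta_1\,\text{numberofdrugs} + \beta_2 T + \beta_3 S + \beta_4 M + \beta_5\,TS + \beta_6\,TM$$

Here $T = 1$ for treated (0 for placebo), $S = 1$ for improved = some and $M = 1$ for improved = marked (both 0 for none). The interaction columns are the products $TS$ and $TM$. A product involving a base level, such as placebo × some, has $T = 0$ in every row, so it would be a column of zeros. Using the estimates printed in the question, going from none to some multiplies the expected healthvalue by about $\exp(0.359) \approx 1.43$ in the placebo group, and by about $\exp(0.359 - 0.330) \approx 1.03$ in the treated group.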

Think about why treatment = placebo does not show up: It's set to 0 to allow the other levels of treatment to be compared to it.

The same holds for treatment = placebo in interactions: treatment = placebo, improved = some is set to 0 to allow it to be compared to treatment = treated, improved = some.
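One way to see this directly is to inspect the design matrix that R builds under its default dummy (treatment) coding. The sketch below only rebuilds the two factors from the question and calls model.matrix(); the object names X and cells are introduced here for illustration.

```r
# Rebuild the two factors exactly as in the question (no random data needed).
treatment <- factor(rep(c(1, 2), c(43, 41)),
                    levels = c(1, 2),
                    labels = c("placebo", "treated"))
improved  <- factor(rep(c(1, 2, 3, 1, 2, 3), c(29, 7, 7, 13, 7, 21)),
                    levels = c(1, 2, 3),
                    labels = c("none", "some", "marked"))

# Design matrix under R's default treatment (dummy) coding.
# treatment * improved expands to treatment + improved + treatment:improved.
X <- model.matrix(~ treatment * improved)
colnames(X)

# One row per cell of the 2 x 3 table, showing which columns are switched on.
cells <- unique(data.frame(treatment, improved, X, check.names = FALSE))
cells
```

The six cells of the treatment by improved table are described by six columns (intercept, one treatment dummy, two improved dummies, two interaction dummies), so there is no room for further interaction terms with the base levels: every placebo row has 0 in both interaction columns.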

There are other parameterizations of categorical variables that do not do this, exactly. Personally, I find those harder to interpret, but you can look at Helmert coding or effect coding for example.
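As a rough illustration of such an alternative, the sketch below refits the question's model with sum-to-zero (effect) coding through the contrasts argument of glm(). The seed is an arbitrary assumption (the question does not set one), so the numbers will not match the output shown in the question; the names fit_dummy and fit_sum are introduced here.

```r
set.seed(1)  # arbitrary seed (an assumption), only so the sketch is repeatable

treatment <- factor(rep(c(1, 2), c(43, 41)),
                    levels = c(1, 2),
                    labels = c("placebo", "treated"))
improved  <- factor(rep(c(1, 2, 3, 1, 2, 3), c(29, 7, 7, 13, 7, 21)),
                    levels = c(1, 2, 3),
                    labels = c("none", "some", "marked"))
numberofdrugs <- rpois(84, 10) + 1
healthvalue   <- rpois(84, 5)
y <- data.frame(healthvalue, numberofdrugs, treatment, improved)

# Default dummy (treatment) coding.
fit_dummy <- glm(healthvalue ~ numberofdrugs + treatment * improved,
                 data = y, family = poisson)

# Same model, but with effect (sum-to-zero) coding for both factors.
fit_sum <- glm(healthvalue ~ numberofdrugs + treatment * improved,
               data = y, family = poisson,
               contrasts = list(treatment = "contr.sum",
                                improved  = "contr.sum"))

names(coef(fit_dummy))
names(coef(fit_sum))

# The two parameterizations describe the same fitted model.
max(abs(fitted(fit_dummy) - fitted(fit_sum)))
```

Both fits have the same number of coefficients and the same fitted values; only the meaning of each coefficient changes. Replacing "contr.sum" with "contr.helmert" gives Helmert coding in the same way.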

Related Solutions

Solved – Understanding over-dispersion as it relates to the Poisson and the Neg. Binomial

This answer comes in two parts. The first addresses the issue of the standard errors and why that implies the models are not identical, as @ndoogan observed in comments, and the second addresses, partially, the issue of when the coefficient estimates might be substantially different.

Consider, for example, a hypothesis test on the coefficient of log(Vmaj) where the null is that the coefficient equals 0.5. The two sets of model estimates would result in rejection of the null in the Poisson case, unless testing at a very high confidence level, and failure to reject the null in the NB case.

More generally, there is more to a collection of estimates than just the point estimates. In the presence of (Negative binomial-like) overdispersion, the standard errors produced by the Negative Binomial-based model will be more accurate estimates of the underlying standard deviation of the coefficient estimates. Thus, the NB model as a whole is more accurate.

For example, a simple model with $\mu_i = \exp\{1 + x_i\}$, and $y_i \sim NB$ with mean $\mu_i$ and variance $5\mu_i$:

library(MASS)  # provides glm.nb()

N  <- 1000
x  <- rnorm(N)
mu <- exp(1 + x)

# Choose r so that y has mean mu and variance mu / p = 5 * mu
p <- 0.2
r <- mu * p / (1 - p)
y <- rnbinom(N, r, p)

poisson_summary <- summary(glm(y ~ x, family = "poisson"))
nb_summary      <- summary(glm.nb(y ~ x))

# Parametric bootstrap calculation of the s.e. of the coefficient of x:
# simulate new data from the true model and refit the Poisson GLM each time

coef_x <- rep(0, 1000)
for (i in seq_along(coef_x)) {
   x  <- rnorm(N)
   mu <- exp(1 + x)
   r  <- mu * p / (1 - p)
   y  <- rnbinom(N, r, p)
   coef_x[i] <- summary(glm(y ~ x, family = "poisson"))$coefficients["x", "Estimate"]
}
data.frame("Sim. SE"    = sd(coef_x - 1),
           "Poisson SE" = poisson_summary$coefficients["x", "Std. Error"],
           "N.B. SE"    = nb_summary$coefficients["x", "Std. Error"])

     Sim..SE Poisson.SE    N.B..SE
1 0.03492467 0.01273354 0.03984743

Here you can see that the simulated standard deviation of the coefficient estimate is roughly 3x the Poisson-based standard error, and is much better approximated by the NB-based standard error.
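For reference, the mean and variance stated for the simulated counts follow from R's rnbinom(n, size, prob) parameterization, whose mean is size(1 - prob)/prob and whose variance is size(1 - prob)/prob^2. With $r_i = \mu_i\,p/(1-p)$ as in the code:

$$E[y_i] = \frac{r_i(1-p)}{p} = \mu_i, \qquad \mathrm{Var}[y_i] = \frac{r_i(1-p)}{p^2} = \frac{\mu_i}{p} = 5\mu_i \quad \text{for } p = 0.2.$$

So the counts have five times the variance that a Poisson model with the same mean assumes, which is why the Poisson-based standard error is too small.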

The coefficient estimates are pretty much the same, as one might expect with this sample size and the std. errors above, so I won't take up space by displaying them.

Additionally, although this is often a minor effect when overdispersion is low, by weighting the observations with more accurate weights (i.e., better estimates of observation-specific variances), the parameter estimates themselves will be more accurate, well, asymptotically at any rate. The rule of thumb I learned long ago was that heteroskedasticity adjustments (for that is what they are, in essence) don't buy you much unless the differences between weights are on the order of 5x or more.

Note, however, that in small samples you may well get more accurate (in MSE terms) point estimates with the Poisson model if there isn't much overdispersion, because you are reducing the variability induced by estimating the dispersion parameter. Of course, this is almost certainly more than offset by the loss of accuracy in the standard errors of the coefficient estimates.

Solved – How to account for overdispersion in a glm with negative binomial distribution

I'm not sure how to correct the p-values. However, you can typically examine the mean-variance assumption in a negative binomial regression by looking at the plot of residuals versus fitted values.

If this plot of residuals versus fitted values is not (roughly) an amorphous, random cloud of data points, then you can try using quasi-Poisson regression. Another alternative is to construct your own mean-variance relationship using quasi-likelihood.
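A minimal sketch of both steps, on simulated data: the data-generating values, the seed, and the object names (fit_nb, fit_qp) are all assumptions made for illustration, not taken from the original question.

```r
library(MASS)  # provides glm.nb()

set.seed(42)   # arbitrary seed for a repeatable illustration
n  <- 500
x  <- rnorm(n)
mu <- exp(1 + 0.5 * x)
y  <- rnbinom(n, size = 2, mu = mu)  # simulated overdispersed counts (hypothetical data)

# Negative binomial fit, then Pearson residuals against fitted values.
fit_nb <- glm.nb(y ~ x)
plot(fitted(fit_nb), residuals(fit_nb, type = "pearson"),
     xlab = "Fitted values", ylab = "Pearson residuals")
abline(h = 0, lty = 2)

# Quasi-Poisson alternative: same mean model, variance = dispersion * mean.
fit_qp <- glm(y ~ x, family = quasipoisson)
summary(fit_qp)$dispersion  # estimated dispersion; values well above 1 indicate overdispersion
```

If the residual plot shows a clear trend or funnel shape, the assumed mean-variance relationship is questionable. The quasi-Poisson fit keeps the Poisson point estimates but scales the standard errors by the square root of the estimated dispersion.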

Hope this helps!
