# How to interpret coefficients in a Poisson regression

How can I interpret the main effects (the coefficients for dummy-coded factors) in a Poisson regression?

Assume the following example:

    treatment     <- factor(rep(c(1, 2), c(43, 41)),
                            levels = c(1, 2),
                            labels = c("placebo", "treated"))
    improved      <- factor(rep(c(1, 2, 3, 1, 2, 3), c(29, 7, 7, 13, 7, 21)),
                            levels = c(1, 2, 3),
                            labels = c("none", "some", "marked"))
    numberofdrugs <- rpois(84, 10) + 1
    healthvalue   <- rpois(84, 5)
    y             <- data.frame(healthvalue, numberofdrugs, treatment, improved)
    test          <- glm(healthvalue ~ numberofdrugs + treatment + improved,
                         data = y, family = poisson)
    summary(test)


The output is:

    Coefficients:
                     Estimate Std. Error z value Pr(>|z|)
    (Intercept)       1.88955    0.19243   9.819   <2e-16 ***
    numberofdrugs    -0.02303    0.01624  -1.418    0.156
    treatmenttreated -0.01271    0.10861  -0.117    0.907   MAIN EFFECT
    improvedsome     -0.13541    0.14674  -0.923    0.356   MAIN EFFECT
    improvedmarked   -0.10839    0.12212  -0.888    0.375   MAIN EFFECT
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


I know that the incidence rate ratio for numberofdrugs is exp(-0.023) = 0.977. But how do I interpret the main effects for the dummy variables?

The exponentiated numberofdrugs coefficient is the factor by which the estimated healthvalue is multiplied when numberofdrugs increases by 1 unit. For categorical (factor) variables, the exponentiated coefficient is the multiplicative factor relative to the base (first) level of that factor, since R uses treatment contrasts by default. exp(Intercept) is the baseline rate, that is, the estimated healthvalue when numberofdrugs is 0 and every factor is at its base level; all other estimates are relative to it.
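As a minimal sketch of that reading, the estimates printed in the question's output can be exponentiated by hand. The names `est`, `rate_ratio` and `pct_change` are introduced here for illustration only; with the fitted object at hand, `exp(coef(test))` should give the same quantities.

```r
# Estimates copied from the summary output in the question
est <- c("(Intercept)"    =  1.88955,
         numberofdrugs    = -0.02303,
         treatmenttreated = -0.01271,
         improvedsome     = -0.13541,
         improvedmarked   = -0.10839)

# Exponentiate: the intercept becomes the expected count at numberofdrugs = 0,
# "placebo", "none"; every other entry becomes a multiplicative rate ratio
rate_ratio <- exp(est)
round(rate_ratio, 3)

# Percent change in the expected count per unit increase (numberofdrugs) or
# versus the base level (factor terms); the intercept is left out
pct_change <- 100 * (rate_ratio[-1] - 1)
round(pct_change, 1)
```

For instance, the treatmenttreated entry comes out at roughly 0.987, i.e. an expected healthvalue about 1.3% lower than under "placebo" for the same numberofdrugs and improved values.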

In your example the estimated healthvalue for someone with 2 drugs, "placebo" and improvement=="none" would be (using addition inside exp as the equivalent of multiplication):

    exp( 1.88955 +             # that's the baseline contribution
         2*-0.02303 + 0 + 0 )  # and the estimated value will be somewhat lower
    [1] 6.318552


While someone on 4 drugs, "treated", and "some" improvement would have an estimated healthvalue of

    exp( 1.88955 + 4*-0.02303 + -0.01271 + -0.13541)
    [1] 5.203388
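Both calculations are instances of the same log-linear formula. Written out for this model, with $\mu$ standing for the estimated healthvalue and $\beta_0,\dots,\beta_4$ as labels for the five coefficients in the table (these symbols are introduced here for illustration and do not appear in the R output), it reads:

```latex
\begin{aligned}
\log \mu &= \beta_0 + \beta_1\,\text{numberofdrugs}
          + \beta_2\,\mathbb{1}[\text{treated}]
          + \beta_3\,\mathbb{1}[\text{some}]
          + \beta_4\,\mathbb{1}[\text{marked}] \\
\mu &= \exp(\beta_0)\,\exp(\beta_1)^{\text{numberofdrugs}}\,
       \exp(\beta_2)^{\mathbb{1}[\text{treated}]}\,
       \exp(\beta_3)^{\mathbb{1}[\text{some}]}\,
       \exp(\beta_4)^{\mathbb{1}[\text{marked}]}
\end{aligned}
```

The first line is the sum inside `exp()`; the second line is the same quantity seen as a product of multiplicative terms, where an indicator equal to 0 contributes a factor of 1.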


ADDENDUM: This is what it means to be "additive on the log scale". "Additive on the log-odds scale" was the phrase that my teacher, Barbara McKnight, used when emphasizing the need to use all applicable term values times their estimated coefficients when doing any kind of prediction. You first add up all the coefficients (including the intercept term), each multiplied by its covariate value, and then exponentiate the resulting sum. The way to return coefficients from regression objects in R is generally to use the coef() extractor function (done with a different random realization below):

    coef(test)
    #      (Intercept)    numberofdrugs treatmenttreated     improvedsome   improvedmarked
    #       1.18561313       0.03272109       0.05544510      -0.09295549       0.06248684


So the calculation of the estimate for a subject with 4 drugs, "treated", with "some" improvement would be:

    exp( sum( coef(test)[ c(1,2,3,4) ] * c(1,4,1,1) ) )
    [1] 3.592999


And the linear predictor for that case should be the sum of:

    coef(test)[c(1,2,3,4)] * c(1,4,1,1)
    #      (Intercept)    numberofdrugs treatmenttreated     improvedsome
    #       1.18561313       0.13088438       0.05544510      -0.09295549
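The hand calculation can be cross-checked against `predict()`, which builds the same linear predictor internally. The sketch below refits the model on a fresh simulation; the seed and the names `fit`, `newcase`, `b` and `eta_manual` are assumptions made for this illustration, so the numbers will not match the ones printed above.

```r
set.seed(1)  # arbitrary seed (an assumption); results differ from those above

treatment <- factor(rep(c(1, 2), c(43, 41)),
                    levels = c(1, 2), labels = c("placebo", "treated"))
improved  <- factor(rep(c(1, 2, 3, 1, 2, 3), c(29, 7, 7, 13, 7, 21)),
                    levels = c(1, 2, 3), labels = c("none", "some", "marked"))
y <- data.frame(healthvalue   = rpois(84, 5),
                numberofdrugs = rpois(84, 10) + 1,
                treatment, improved)
fit <- glm(healthvalue ~ numberofdrugs + treatment + improved,
           data = y, family = poisson)

# Subject with 4 drugs, "treated", "some" improvement
newcase <- data.frame(numberofdrugs = 4,
                      treatment = factor("treated", levels = levels(y$treatment)),
                      improved  = factor("some",    levels = levels(y$improved)))

# Manual calculation, picking coefficients by name rather than by position
b <- coef(fit)
eta_manual <- sum(b[c("(Intercept)", "numberofdrugs",
                      "treatmenttreated", "improvedsome")] * c(1, 4, 1, 1))

eta_predict <- predict(fit, newdata = newcase, type = "link")      # linear predictor
mu_predict  <- predict(fit, newdata = newcase, type = "response")  # expected count

all.equal(unname(eta_predict), eta_manual)       # TRUE
all.equal(unname(mu_predict),  exp(eta_manual))  # TRUE
```

Indexing by name is a little more robust than `c(1,2,3,4)`, since it does not depend on the order in which the terms appear in the formula.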


These principles should apply to any stats package that returns a table of coefficients to the user. The method and principles are more general than might appear from my use of R.

I'm copying selected clarifying comments since they 'disappear' in the default display:

Q: So you interpret the coefficients as ratios! Thank you! – MarkDollar

A: The coefficients are the natural logarithms of the ratios. – DWin

Q2: In that case, in a Poisson regression, are the exponentiated coefficients also referred to as "odds ratios"? – oort

A2: No. If it were logistic regression they would be, but in Poisson regression, where the LHS is the number of events and the implicit denominator is the number at risk, the exponentiated coefficients are "rate ratios" or "relative risks".
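Why the coefficients are logarithms of ratios can be sketched in one line. Here $\mu$ denotes the estimated healthvalue and $\beta_{\text{treated}}$ the treatmenttreated coefficient (both symbols are labels introduced for this illustration). The two subjects compared share the same numberofdrugs and improved values, so every other term in the linear predictor cancels:

```latex
\log \mu_{\text{treated}} - \log \mu_{\text{placebo}} = \beta_{\text{treated}}
\quad\Longrightarrow\quad
\frac{\mu_{\text{treated}}}{\mu_{\text{placebo}}} = \exp(\beta_{\text{treated}})
```

With the estimate from the question's output, exp(-0.01271) is about 0.987, which is the rate ratio of "treated" versus "placebo".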