Solved – Interpretation of odds ratios in a multiple logistic regression with interaction

binomial distribution, generalized linear model, logistic, odds-ratio, r

I'm trying to explain, in a report for non-scientists, the output of a binomial model. However, I have some trouble with log odds ratios, probabilities, and so on. I read some related topics but did not find help for multiple regression.

I'm trying to predict, for example, sex (Female/Male) from two numeric personality variables: Antagonism and Negative Affect.

Here's a reproducible example:

library(neuropsychology)
# Logistic regression of Sex on Antagonism, Negative_Affect and their interaction
fit <- glm(Sex ~ Antagonism * Negative_Affect, data = neuropsychology::personality, family = binomial(link = "logit"))

Here's the output of the summary:

glm(formula = Sex ~ Antagonism * Negative_Affect, family = binomial(link = "logit"), 
    data = df)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.9404  -0.6662  -0.5104  -0.3404   2.6423  

Coefficients:
                           Estimate Std. Error z value Pr(>|z|)    
(Intercept)                -1.66100    0.33549  -4.951 7.38e-07 ***
Antagonism                  0.96291    0.16068   5.993 2.06e-09 ***
Negative_Affect            -0.31413    0.10650  -2.950  0.00318 ** 
Antagonism:Negative_Affect -0.09675    0.04401  -2.198  0.02794 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1351.6  on 1326  degrees of freedom
Residual deviance: 1194.7  on 1323  degrees of freedom
AIC: 1202.7

Number of Fisher Scoring iterations: 5

What interests me are the coefficients obtained by running coef(fit):

(Intercept)   Antagonism    Negative_Affect       Antagonism:Negative_Affect 
-1.6609981     0.9629119       -0.3141261                 -0.0967504 

I understand that these coefficients are expressed in terms of log odds ratios.

So, when my two variables are 0, the odds of being a Male are exp(-1.66) = 0.19. If I transform that to a probability, it returns neuropsychology::odds_to_probs(0.19, log=F) = 0.16, so 16%. However, the other coefficients are harder to interpret. An increase of 1 on the Antagonism scale should increase the log odds of being a Male by 0.96. But does that mean that it increases the probability to neuropsychology::odds_to_probs(0.96, log=T) = 0.72, i.e. 72%? Or that the log odds change from -1.66 to -1.66 + 0.96 = -0.70, which corresponds to 33% (a change of 33 - 16 = 17 percentage points)?
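
To make the two readings concrete, here is a quick sketch of both calculations using base R's plogis() (the inverse logit) on the coefficients above:

b <- coef(fit)

# Reading 1: convert the Antagonism coefficient on its own from a log odds to a probability
plogis(b["Antagonism"])                      # ~0.72

# Reading 2: add the coefficient to the intercept, then convert the new log odds
plogis(b["(Intercept)"] + b["Antagonism"])   # ~0.33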

And what about the interaction effect?

Is there a way to explain it to people with no scientific background, in terms of probability change? Thanks!

Best Answer

There's a problem with your arithmetic. You can either exponentiate the intercept to get the odds, exponentiate the coefficient for Antagonism to get the odds ratio, and then multiply them to get the new odds (which you can then convert into a probability). Or you can just evaluate the linear predictor and convert that from a log odds to a probability. Either way, it gives you the same probability (.33).

cofs = c(-1.6609981, 0.9629119, -0.3141261, -0.0967504)

# Probability of Male when Antagonism = 0 and Negative_Affect = 0
plogis(cofs[1])
# [1] 0.1596281

# Route 1: baseline odds times the odds ratio for Antagonism, then convert to a probability
odds.antag1 = exp(cofs[1])*exp(cofs[2]);  odds.antag1
# [1] 0.4975366
odds.antag1/(1+odds.antag1)
# [1] 0.3322367

# Route 2: evaluate the linear predictor at Antagonism = 1, Negative_Affect = 0, then inverse-logit
plogis(cofs[1] + cofs[2]*1 + cofs[3]*0 + cofs[4]*1*0)
# [1] 0.3322367
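
On the interaction: with the product term in the model, 0.96 is the log odds ratio for Antagonism only when Negative_Affect = 0. More generally, the odds ratio for a one-unit increase in Antagonism is exp(0.963 - 0.097*Negative_Affect), so it shrinks as Negative_Affect increases. A minimal sketch (the Negative_Affect values below are arbitrary, chosen just for illustration):

# Odds ratio for a one-unit increase in Antagonism at a few values of Negative_Affect
na.vals = c(0, 2, 4)
exp(cofs[2] + cofs[4]*na.vals)
# approximately 2.62, 2.16, 1.78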

As far as how to communicate this to laypeople, I mostly don't try in any direct sense. Clients usually want to know if X is associated with Y. The test of the relevant coefficient provides that information. If they want to see how this plays out, you would do best to plot. It's worth remembering that on the probability scale, all variables are in essence always interacting because of the nonlinear transformation at the heart of logistic regression.
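
If a plot would help, one option is to compute predicted probabilities over a grid and draw one curve per level of the moderator. A sketch along those lines (the Antagonism and Negative_Affect ranges are guesses for illustration, not taken from the actual data):

# Predicted probability of Male across Antagonism, at a few fixed values of Negative_Affect
newdat <- expand.grid(Antagonism = seq(0, 4, length.out = 100),
                      Negative_Affect = c(0, 2, 4))
newdat$p <- predict(fit, newdata = newdat, type = "response")

plot(p ~ Antagonism, data = subset(newdat, Negative_Affect == 0), type = "l",
     ylim = c(0, 1), ylab = "Predicted probability of Male")
lines(p ~ Antagonism, data = subset(newdat, Negative_Affect == 2), lty = 2)
lines(p ~ Antagonism, data = subset(newdat, Negative_Affect == 4), lty = 3)
legend("topleft", legend = paste("Negative_Affect =", c(0, 2, 4)), lty = 1:3)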

It may help to read some of my other answers that relate to these issues.
