Solved – main effect in logistic regression with the presence of interaction

interaction, logistic, regression

I have a question about how to get a main effect in the presence of an interaction effect.

I have two cohorts, say cohort A and cohort B. The cohort variable is coded 1 for cohort A and 0 for cohort B.

The variable that interacts with cohort is also binary: gender (sex), with sex=1 for male and sex=0 for female.

My model is: Probability of smoking = beta0 + cohort*beta1 + sex*beta2 + cohort*sex*beta3

I know that I cannot just take the coefficient of cohort (i.e. beta1) from, say, SAS to be the effect of "being in cohort A vs cohort B" when modelling the probability of smoking.

For example if beta1 = 0.5, I know that I CANNOT just take exp(0.5) = 1.65 to say "Oh being in cohort A is 1.65 times more likely to smoke than cohort B".

I know we need to take into account whether the person is male or female at the same time.

So my question is:

From this interaction model (i.e. with the 4 betas, including the intercept term), is there any way I can get or calculate the cohort effect? (Say, being in cohort A is XX times more likely to smoke than being in cohort B.)

Or is there simply no way of calculating the cohort effect WITHOUT considering the sex effect at the same time? (I mean that the conclusion from this model has to read like this: being female in cohort A is XX times more likely to smoke than being female in cohort B.)

What I am asking is: in this model, is it true that there is no way to interpret the cohort effect without ACCOUNTING for the sex effect? Do we always have to include "being female" or "being male" (i.e. the levels of the sex variable) at the same time?

If there is a way of just saying "being in cohort A is XX times more likely to smoke than being in cohort B, without worrying about which sex the person is", could you let me know how? And specifically, which software do you think can give me that value? I am using SAS, but I cannot figure out how to isolate the cohort effect without considering the sex effect. Also, the cohort*sex effect is significant.

thank you very much

Best Answer

In addition to the multiplicative marginal effect given by @ChrisNovak, you can also calculate the additive marginal effect. Using the same smoker data, first we get the exponentiated $\beta$s for comparison:

. logit smoker i.cohort##i.sex, or nolog;

Logistic regression                               Number of obs   =         80
                                                  LR chi2(3)      =      25.73
                                                  Prob > chi2     =     0.0000
Log likelihood = -42.187227                       Pseudo R2       =     0.2337

------------------------------------------------------------------------------
      smoker | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      cohort |
          B  |         36   33.54102     3.85   0.000      5.79752    223.5438
             |
         sex |
          F  |   3.857143   3.436216     1.52   0.130     .6729071    22.10937
             |
  cohort#sex |
        B#F  |   .0972222   .1114662    -2.03   0.042     .0102767    .9197636
             |
       _cons |   .1111111   .0828173    -2.95   0.003     .0257816    .4788568
------------------------------------------------------------------------------
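
Because this model is saturated (four cells, four parameters), the fitted odds equal the observed odds in each cohort-by-sex cell, so the table above can be checked by hand. The following is a minimal Python sketch of that arithmetic, not part of the original Stata session; the cell counts are copied from the fake data listed at the end of this answer and the variable names are illustrative.

```python
# Cell counts from the answer's fake data: (cohort, sex) -> (non-smokers, smokers)
counts = {
    ("A", "M"): (18, 2),
    ("A", "F"): (14, 6),
    ("B", "M"): (4, 16),
    ("B", "F"): (8, 12),
}

# Odds of smoking in each cell = smokers / non-smokers
odds = {cell: smokers / non_smokers for cell, (non_smokers, smokers) in counts.items()}

baseline_odds = odds[("A", "M")]                       # the _cons row
or_cohort_men = odds[("B", "M")] / odds[("A", "M")]    # the cohort B row
or_sex_in_a = odds[("A", "F")] / odds[("A", "M")]      # the sex F row
or_cohort_women = odds[("B", "F")] / odds[("A", "F")]  # cohort odds ratio among women
ratio_of_ors = or_cohort_women / or_cohort_men         # the cohort#sex B#F row

print(f"baseline odds (A, M):    {baseline_odds:.7f}")
print(f"cohort OR among men:     {or_cohort_men:.4f}")
print(f"sex OR within cohort A:  {or_sex_in_a:.6f}")
print(f"cohort OR among women:   {or_cohort_women:.4f}")
print(f"ratio of the cohort ORs: {ratio_of_ors:.7f}")
```

The cohort row (36) is therefore the B-versus-A odds ratio for men only; for women it is 36 multiplied by the interaction term, i.e. about 3.5.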

You can calculate the average finite difference (i.e., the discrete analogue of the derivative) for cohort as if everyone was male and then as if everyone was female:

. margins sex, dydx(cohort);

Conditional marginal effects                      Number of obs   =         80
Model VCE    : OIM

Expression   : Pr(smoker), predict()
dy/dx w.r.t. : 2.cohort

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
2.cohort     |
         sex |
          M  |         .7   .1118034     6.26   0.000     .4808694    .9191306
          F  |         .3        .15     2.00   0.046     .0060054    .5939946
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

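You want the difference between F and M, which is 0.3-0.7=-0.4. In other words, moving from cohort A to cohort B raises the probability of smoking by 40 percentage points less for women than for men.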
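
The two conditional effects and their difference can also be reproduced directly from the cell proportions. This is a rough Python sketch under the assumption that the four cells are independent binomial samples; with this saturated model the resulting standard errors agree with the delta-method values in the output above. The counts come from the fake data listed at the end of this answer, and the helper name is made up for illustration.

```python
from math import sqrt

# Cell counts from the answer's fake data: (cohort, sex) -> (non-smokers, smokers)
counts = {
    ("A", "M"): (18, 2),
    ("A", "F"): (14, 6),
    ("B", "M"): (4, 16),
    ("B", "F"): (8, 12),
}

def proportion_and_variance(non_smokers, smokers):
    """Observed smoking proportion and its binomial sampling variance."""
    n = non_smokers + smokers
    p = smokers / n
    return p, p * (1 - p) / n

# For each sex: difference in smoking probability, cohort B minus cohort A
effects = {}
for sex in ("M", "F"):
    p_a, var_a = proportion_and_variance(*counts[("A", sex)])
    p_b, var_b = proportion_and_variance(*counts[("B", sex)])
    effects[sex] = (p_b - p_a, sqrt(var_a + var_b))

# Contrast of the two conditional effects (F vs M)
contrast = effects["F"][0] - effects["M"][0]
contrast_se = sqrt(effects["F"][1] ** 2 + effects["M"][1] ** 2)

for sex, (diff, se) in effects.items():
    print(f"{sex}: difference = {diff:.4f}, SE = {se:.7f}")
print(f"F vs M: contrast = {contrast:.4f}, SE = {contrast_se:.7f}")
```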

In Stata, this can actually be done in one step (with SEs):

. margins r.sex, dydx(cohort);

Contrasts of conditional marginal effects
Model VCE    : OIM

Expression   : Pr(smoker), predict()
dy/dx w.r.t. : 2.cohort

------------------------------------------------
             |         df        chi2     P>chi2
-------------+----------------------------------
1b.cohort    |
         sex |  (omitted)
-------------+----------------------------------
2.cohort     |
         sex |          1        4.57     0.0325
------------------------------------------------

--------------------------------------------------------------
             |   Contrast Delta-method
             |      dy/dx   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
2.cohort     |
         sex |
   (F vs M)  |        -.4   .1870829     -.7666757   -.0333243
--------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the
      base level.
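
If a single cohort effect "without worrying about sex" is wanted, one option is to average the two conditional effects over the sex mix of the sample. The Python sketch below illustrates that idea with the same fake counts; it is one possible summary rather than the only one, and its value depends on how many men and women are in the data. In Stata, `margins, dydx(cohort)` with no `sex` term should report the same kind of average marginal effect.

```python
# Cell counts from the answer's fake data: (cohort, sex) -> (non-smokers, smokers)
counts = {
    ("A", "M"): (18, 2),
    ("A", "F"): (14, 6),
    ("B", "M"): (4, 16),
    ("B", "F"): (8, 12),
}

def smoking_proportion(cohort, sex):
    """Observed (and, for this saturated model, fitted) probability of smoking."""
    non_smokers, smokers = counts[(cohort, sex)]
    return smokers / (non_smokers + smokers)

# Number of people of each sex, pooled over both cohorts
n_by_sex = {
    sex: sum(sum(counts[(cohort, sex)]) for cohort in ("A", "B"))
    for sex in ("M", "F")
}
total = sum(n_by_sex.values())

# Weight each sex-specific B-minus-A difference by that sex's share of the sample
average_effect = sum(
    n_by_sex[sex] / total
    * (smoking_proportion("B", sex) - smoking_proportion("A", sex))
    for sex in ("M", "F")
)
print(round(average_effect, 3))
```

With 40 men and 40 women this gives (0.7 + 0.3) / 2 = 0.5, a 50 percentage point higher probability of smoking in cohort B on average.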

Here's the complete Stata code for anyone interested:

#delimit;
clear;

/* Create Fake Data */
input
str1 cohort str1 sex smoker n;
"A"    "M" 0  18;
"A"    "F" 0  14;  
"B"    "M" 0  4;
"B"    "F" 0  8;
"A"    "M" 1  2;
"A"    "F" 1  6;  
"B"    "M" 1  16;
"B"    "F" 1  12;
end;

/* sencode is a user-written command: ssc install sencode */
sencode cohort, replace;
sencode sex, replace;
expand n;
drop n;

/* Check the smoking proportions in each cell (odds = p/(1-p)) */
table sex cohort, c(mean smoker);

logit smoker i.cohort##i.sex, or nolog;
margins sex, dydx(cohort);
margins r.sex, dydx(cohort);

Related Solutions

Solved – Interpretation of interaction terms if main effect is insignificant

Rather than saying the relationship is stronger, I think it's more precise to say that weight increases significantly more quickly with height for males than for females. Strength of relationship would be measured by measures like $R^2$, and these are affected not only by the rate of increase of one variable with another, but by the amount of noise in the data. e.g. if the data were something like this:

maleheight <- rnorm(1000, 70, 3)
femaleheight <- rnorm(1000, 65, 2.5)
maleweight <- maleheight*2.2 + rnorm(1000, 0, 20)
femaleweight <- femaleheight*1.3 + rnorm(1000, 0, 10)
height <- c(maleheight, femaleheight)
weight <- c(maleweight, femaleweight)
male <- c(rep(1, 1000), rep(0, 1000))
data <- data.frame(cbind(height, weight, male))

then the model

m1 <- with(data, lm(weight~height + male + height*male))
summary(m1)

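shows your pattern (a steeper slope for males), but the relationship need not look any stronger for men, because their weights are also much noisier (a residual SD of 20 versus 10 for women).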

Solved – Interpreting coefficients in a logistic regression model with a categorical variable having more than 2 levels

If you write out the fitted model for the log odds of smoking

$$\log \frac{\Pr(Y=1)}{\Pr(Y=0)} = -4.380\,1 - 0.324\,56\ I_\mathrm{teen} + 1.451\,19\ I_\mathrm{mature} - 0.989\,1\ I_\mathrm{old}$$

where the dummies are $$I_\mathrm{teen}=\left\{ \begin{array}{l l} 0 & X\neq\mathrm{teenager}\\ 1& X=\mathrm{teenager}\\ \end{array}\right.$$ etc., you can confirm your calculations. Note, though, that "likely" is ambiguous (it might be taken as referring to probability), and you might prefer to say something like "the odds of a teenager's smoking are 28% lower than those of an adult's smoking" in a formal or didactic context.
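
The "28% lower" figure comes from exponentiating the teenager coefficient. As a quick check, here is a small Python sketch that converts each coefficient in the fitted equation into an odds ratio and a percentage change in the odds relative to the reference level (adults); the dictionary keys are just labels chosen for illustration.

```python
from math import exp

# Coefficients copied from the fitted log-odds equation (reference level: adult)
coefficients = {"teen": -0.32456, "mature": 1.45119, "old": -0.9891}

# exp(beta) is the odds ratio versus the reference level
odds_ratios = {level: exp(beta) for level, beta in coefficients.items()}

# (odds ratio - 1) * 100 is the percentage change in the odds
percent_change = {level: (ratio - 1) * 100 for level, ratio in odds_ratios.items()}

for level in coefficients:
    print(f"{level}: odds ratio = {odds_ratios[level]:.3f}, "
          f"change in odds = {percent_change[level]:+.1f}%")
```

For teenagers the odds ratio is about 0.72, i.e. odds roughly 28% lower than for adults. These are statements about odds, not probabilities.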
