Solved – Interaction in logistic models with R. Using * operator or create an interaction variable

interaction, interpretation, logistic, r, regression

I want to test the presence of an interaction term in a logistic regression with glm().

The formula is:

Gest.hypertension ~ ART.conc + Parity + Age + Smoke + Nulliparity + Syst.disease.type

With Age the only continuous variable and the others categorical.

The interaction term is ART.conc (2 levels) x Parity (3 levels), which are also the variables of interest. I found two ways of representing the interaction in glm():

  • By explicitly writing ART.conc * Parity, which creates both the simple-effect and interaction terms.
  • By creating an artificial new variable with interaction(ART.conc, Parity) and adding the simple-effect terms separately.
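
As a sketch, the two specifications look like this on simulated stand-in data (the real data set isn't shown here, so all values below are made up, and only ART.conc, Parity and the outcome are kept for brevity):

```r
## Simulated stand-in data: an assumption, not the question's data
set.seed(1)
n <- 500
dat <- data.frame(
  ART.conc = factor(sample(c("no", "yes"), n, replace = TRUE)),
  Parity   = factor(sample(1:3, n, replace = TRUE))
)
dat$Gest.hypertension <- rbinom(n, 1, 0.3)

## Way 1: the formula interface builds simple effects and interaction
m1 <- glm(Gest.hypertension ~ ART.conc * Parity,
          data = dat, family = binomial)

## Way 2: a hand-made combined factor plus the simple-effect terms
m2 <- glm(Gest.hypertension ~ interaction(ART.conc, Parity) + ART.conc + Parity,
          data = dat, family = binomial)

exp(coef(m1))  # odds ratios, way 1
exp(coef(m2))  # odds ratios, way 2: note the NA simple-effect terms
```

The second call reproduces the NA pattern from model 2 below: the combined factor already spans the simple effects, so those extra coefficients are aliased.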

These are the results, omitting the other control variables (OR and p-value shown):

model1: Gest.hypertension ~ ART.conc * Parity + ...

ART.conc->yes                       3.35     0.073 .
Parity->2                           7.15     0.001 **
Parity->3                             38     <0.001 ***
ART.conc->yes:Parity->2            0.262     0.054 .
ART.conc->yes:Parity->3           0.0532     0.009 **

model2: Gest.hypertension ~ interaction(ART.conc, Parity) + ART.conc + Parity + ...

interaction(ART.conc, Parity)->yes.1  3.35    0.072590 .  
interaction(ART.conc, Parity)->no.2   7.15    0.001206 ** 
interaction(ART.conc, Parity)->yes.2  6.26    0.002698 ** 
interaction(ART.conc, Parity)->no.3   38.05    0.000202 ***
interaction(ART.conc, Parity)->yes.3  6.78    0.011071 *  
ART.concyes                              NA         NA    
Parity2                                  NA         NA    
Parity3                                  NA         NA

So the two models give more or less the same results when one of the two terms is zero, and that's fine. I have trouble understanding what happens when both terms are non-zero. For example, in the second model, keeping Parity == 2 and moving ART.conc from no to yes, the OR goes from 7.15 to 6.26. Instead, applying what I understood about logistic regression and interactions, the coefficient for the case (ART.conc, Parity)->yes.2 should be
$$OR_{parity.2}*OR_{art.yes:parity.2} = 1.87$$
very different from 6.26…

Therefore, I'd like to know what I'm missing and which of the two models I should use.

Thanks!

UPDATE!!!

Thanks to @Gavin Simpson's answer I thought I had solved the problem, and ditched the interaction() trick. But then I noticed a pretty silly error of mine.

We said that to interpret the interaction you do
$$OR_{simple.eff}*OR_{interaction.eff}$$
Take the case ART.yes/Parity.2 as an example. I said that to compute the effect of the interaction of ART with Parity with the classical method you should do
$$OR_{art.yes}*OR_{art.yes:parity.2} ≈ 3.35*0.262 ≈ 0.87$$
But if you perform
$$OR_{parity.2}*OR_{art.yes:parity.2} ≈ 7.15*0.262 ≈ 1.87$$
I was troubled by this disparity. I needed a single number describing the overall risk of the pathology in the case mentioned above, and instead I got two numbers that I didn't know how to interpret.
Trying to solve this, I noticed my error. To compute the overall effect, it is not enough to multiply one simple effect by the interaction effect: you have to include the other simple effect too!
$$OR_{simple.eff.1}*OR_{simple.eff.2}*OR_{interaction.eff}$$
or
$$OR_{art.yes}*OR_{parity.2}*OR_{art.yes:parity.2} ≈ 3.35*7.15*0.262 ≈ 6.26$$

6.26 is the same OR for the interaction(ART.conc, Parity)->yes.2 that is produced using the interaction trick.
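
This product rule is just addition on the log-odds scale. Relative to the baseline cell (ART.no, Parity.1), the linear predictor for (ART.yes, Parity.2) collects all three coefficients,
$$\log OR_{art.yes,parity.2} = \beta_{art.yes} + \beta_{parity.2} + \beta_{art.yes:parity.2}$$
and exponentiating the sum gives the product of the three ORs,
$$OR_{art.yes,parity.2} = OR_{art.yes}*OR_{parity.2}*OR_{art.yes:parity.2} ≈ 6.26$$
which is exactly the single combined coefficient that the interaction() dummy estimates.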

Therefore, using interaction() with the drop = TRUE option and not including the simple-effect terms at all (to avoid the NAs), model 1 and model 2 give the same results, with model 2 reporting the already-computed combined effect for each interaction cell.
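
That equivalence can be checked on made-up data (an assumption; any simulated outcome will do), fitting both parameterisations and comparing their fits:

```r
set.seed(42)
n <- 300
dat <- data.frame(
  ART.conc = factor(sample(c("no", "yes"), n, replace = TRUE)),
  Parity   = factor(sample(1:3, n, replace = TRUE))
)
dat$y <- rbinom(n, 1, 0.3)

## Model 1: formula-built interaction
m1 <- glm(y ~ ART.conc * Parity, data = dat, family = binomial)

## Model 2: combined factor only, with drop = TRUE and no simple effects
m2 <- glm(y ~ interaction(ART.conc, Parity, drop = TRUE),
          data = dat, family = binomial)

## Same fitted probabilities; m2's cell coefficients are the combined effects
all.equal(fitted(m1), fitted(m2))
```

The two models are reparameterisations of each other: m2's coefficient for a cell equals the sum of m1's simple-effect and interaction coefficients for that cell.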

Best Answer

Using some dummy data similar to what you must have used:

set.seed(10)
dat <- data.frame(ART.conc = factor(sample(c("yes","no"), 10, replace = TRUE),
                                    levels = c("yes","no")),
                  Parity   = factor(sample(1:3, 10, replace = TRUE)))

and for simplicity, form the interaction variable and store it in dat:

dat <- transform(dat, interact = interaction(ART.conc, Parity))

OK, now the model matrix will show us how the factor variables are converted into terms in the model. First, the correct way:

> model.matrix( ~ ART.conc * Parity, data = dat)
   (Intercept) ART.concno Parity2 Parity3 ART.concno:Parity2 ART.concno:Parity3
1            1          1       1       0                  1                  0
2            1          0       1       0                  0                  0
3            1          0       0       0                  0                  0
4            1          1       1       0                  1                  0
5            1          0       1       0                  0                  0
6            1          0       1       0                  0                  0
7            1          0       0       0                  0                  0
8            1          0       0       0                  0                  0
9            1          1       1       0                  1                  0
10           1          0       0       1                  0                  0

And now the way you did it:

> model.matrix( ~ interact + ART.conc + Parity, data = dat)
   (Intercept) interactno.1 interactyes.2 interactno.2 interactyes.3
1            1            0             0            1             0
2            1            0             1            0             0
3            1            0             0            0             0
4            1            0             0            1             0
5            1            0             1            0             0
6            1            0             1            0             0
7            1            0             0            0             0
8            1            0             0            0             0
9            1            0             0            1             0
10           1            0             0            0             1
   interactno.3 ART.concno Parity2 Parity3
1             0          1       1       0
2             0          0       1       0
3             0          0       0       0
4             0          1       1       0
5             0          0       1       0
6             0          0       1       0
7             0          0       0       0
8             0          0       0       0
9             0          1       1       0
10            0          0       0       1

The obvious thing is that these two are not the same because there are far more terms in the design matrix for the method using interaction() than for the correct way.

Now, we also note why in the model with the interaction() term you get NA values for some coefficients. Compare the interactyes.3 column with the Parity3 column in the design matrix from the interaction() model. Look familiar? So the interaction() has created information that relates to the main effects terms you added to the formula alongside the interaction() variable. At the very least, you don't need to specify the main effects terms alongside the interaction() variable in that version of the model. However, it'd still be wrong even if you fixed that.

Consider

> with(dat, levels(interact))
[1] "yes.1" "no.1"  "yes.2" "no.2"  "yes.3" "no.3" 
> with(dat, table(interact))
interact
yes.1  no.1 yes.2  no.2 yes.3  no.3 
    3     0     3     3     1     0

What interaction() has done (by default, I'll add) is retain all possible combinations of the levels of the two factors; that's 6 even though we only observed 4 of them in this dummy data set. This shows up in the model matrix as a vector of 0s for both interactno.1 and interactno.3 if you just include the interaction() term in the model:

> model.matrix( ~ interact, data = dat)
   (Intercept) interactno.1 interactyes.2 interactno.2 interactyes.3
1            1            0             0            1             0
2            1            0             1            0             0
3            1            0             0            0             0
4            1            0             0            1             0
5            1            0             1            0             0
6            1            0             1            0             0
7            1            0             0            0             0
8            1            0             0            0             0
9            1            0             0            1             0
10           1            0             0            0             1
   interactno.3
1             0
2             0
3             0
4             0
5             0
6             0
7             0
8             0
9             0
10            0

Hence we won't be able to identify coefficients for either of those two all-zero columns, so you'll still get NAs.
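
The rank deficiency is visible directly in the design matrix. The data below are hand-built (the exact values are assumptions) to observe the same four of six ART.conc x Parity cells as the answer's dummy data:

```r
## Eight observations covering only four of the six ART.conc x Parity cells
dat <- data.frame(
  ART.conc = factor(c("yes", "yes", "yes", "no", "no", "no", "yes", "no"),
                    levels = c("yes", "no")),
  Parity   = factor(c(1, 1, 2, 2, 2, 2, 3, 2))
)
dat$interact <- interaction(dat$ART.conc, dat$Parity)  # keeps all 6 levels

X <- model.matrix(~ interact, data = dat)
ncol(X)      # 6 columns requested (intercept + 5 dummies)
qr(X)$rank   # only 4 are identifiable: two dummy columns are all zero
```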

In sum, the two ways you mention are not the same model, and you'll cause yourself a world of pain if you try to fit interactions using the interaction() function.

Hopefully by now you realize these forms aren't the same and why, and more importantly why it is a bad idea to circumvent R's formula interface notation.

Aside

I mentioned the default behaviour of interaction(). You can do better by adding drop = TRUE to the call, storing the result as a new variable:

> dat <- transform(dat, interact2 = interaction(ART.conc, Parity, drop = TRUE))
> with(dat, levels(interact2))
[1] "yes.1" "yes.2" "no.2"  "yes.3"

Now we only have the four observed levels. Then you can do:

> model.matrix( ~ ART.conc + Parity + interact2, data = dat)
   (Intercept) ART.concno Parity2 Parity3 interact2yes.2 interact2no.2
1            1          1       1       0              0             1
2            1          0       1       0              1             0
3            1          0       0       0              0             0
4            1          1       1       0              0             1
5            1          0       1       0              1             0
6            1          0       1       0              1             0
7            1          0       0       0              0             0
8            1          0       0       0              0             0
9            1          1       1       0              0             1
10           1          0       0       1              0             0
   interact2yes.3
1               0
2               0
3               0
4               0
5               0
6               0
7               0
8               0
9               0
10              1

But then this still isn't right because you have one more term than needed, again because R can't do the right thing about removing one of the combinations of levels which gets absorbed into the Intercept term. Note the Parity3 and interact2yes.3 terms; they are coded the same way; the Parity3 term works with the Intercept term to give you the information in interact2yes.3.
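That aliasing is easy to confirm. Using hand-built data with the same four observed cells as the answer's dummy data (the exact values are assumptions, chosen so the example is self-contained), the Parity3 and interact2yes.3 columns of the design matrix are identical:

```r
dat <- data.frame(
  ART.conc = factor(c("yes", "yes", "yes", "no", "no", "no", "yes", "no"),
                    levels = c("yes", "no")),
  Parity   = factor(c(1, 1, 2, 2, 2, 2, 3, 2))
)
dat$interact2 <- interaction(dat$ART.conc, dat$Parity, drop = TRUE)

X <- model.matrix(~ ART.conc + Parity + interact2, data = dat)
all(X[, "Parity3"] == X[, "interact2yes.3"])  # duplicated information
```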
