Solved – How to test main effects of categorical variables in a binary logistic regression including an interaction

logisticrregression

I measure two binary responses from each participant (ChoiceVA = V or A, AestheticOnly = 0 or 1). There are two experiments (between-participant). I want to test the following hypotheses:

AestheticOnly depends on Experiment (main effect)
AestheticOnly depends on ChoiceVA (main effect)
The way AestheticOnly depends on Experiment depends on ChoiceVA (interaction)

Here is my data. The first number in each cell is the proportion of participants scoring 1 for AestheticOnly, and the second number is the n for participants in that cell.

                         ChoiceVA               
                        A       V     All

Experiment  1      0.1463  0.3939  0.2568
                       41      33      74

            2      0.4545  0.2619  0.3281
                       22      42      64

            All    0.2540  0.3200  0.2899
                       63      75     138

Just from looking at the data it is pretty obvious that neither main effect is significant (e.g. for ChoiceVA, bottom row, .25 of 63 participants is not significantly different from .32 of 75 participants). In my naivity I thought perhaps I could test these hypotheses with a straightforward binary logistic regression:

> mod <- glm( AestheticOnly ~ Experiment+ChoiceVA+Experiment*ChoiceVA, data = d, family=binomial )
> summary(mod)

Call:
glm(formula = AestheticOnly ~ Experiment + ChoiceVA + Experiment *
    ChoiceVA, family = binomial, data = d)

Deviance Residuals:
    Min       1Q   Median       3Q      Max 
-1.1010  -0.7793  -0.5625   1.2557   1.9605 

Coefficients:
                      Estimate Std. Error z value Pr(>|z|)   
(Intercept)            -1.7636     0.4419  -3.991 6.57e-05 ***
Experiment2             1.5813     0.6153   2.570  0.01017 * 
ChoiceVAV               1.3328     0.5676   2.348  0.01887 * 
Experiment2:ChoiceVAV  -2.1866     0.7929  -2.758  0.00582 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 166.16  on 137  degrees of freedom
Residual deviance: 157.01  on 134  degrees of freedom
AIC: 165.01

Number of Fisher Scoring iterations: 4

Clearly, the main effects are not being tested here in the way I hoped. I believe that this model, in testing main effects, rather than testing e.g. ChoiceVA=A against ChoiceVA=V across both levels of Experiment, is confining itself to that comparison only when Experiment=1. Can a model be constructed that instead tests the main effects in the way I would like?

This is related to a previous question (Logistic regression gives very different result to Fisher's exact test – why?), but when I asked it I understand this even worse than I do now and consequently the question was so unclear that I need to start again.

Best Answer

I have found a method which "works" in the sense that it produces p-values which are clearly appropriate. Because of the simplicity of my data-set, I know the main effects should not be significant (Fisher's exact test tells me that), but it looks very much like the interaction should be. This is the only method I have found giving output in line with this. However I am not sure if this method is valid in other senses. I would very much appreciate comment on that. Here it is:

> library(car)
> Anova(glm( AestheticOnly ~ Experiment*ChoiceVA, data = d, family=binomial ), type=2)
Analysis of Deviance Table (Type II tests)

Response: AestheticOnly
                    LR Chisq Df Pr(>Chisq)   
Experiment            0.5768  1   0.447582   
ChoiceVA              0.4583  1   0.498433   
Experiment:ChoiceVA   7.8432  1   0.005101 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Cheers,

Ben

Related Question