Solved – How to test main effects of categorical variables in a binary logistic regression including an interaction

I measure two binary responses from each participant (ChoiceVA = V or A, AestheticOnly = 0 or 1). There are two experiments (between-participant). I want to test the following hypotheses:

AestheticOnly depends on Experiment (main effect)
AestheticOnly depends on ChoiceVA (main effect)
The way AestheticOnly depends on Experiment depends on ChoiceVA (interaction)

Here is my data. The first number in each cell is the proportion of participants scoring 1 for AestheticOnly, and the second number is the n for participants in that cell.

                        ChoiceVA
                        A        V        All

Experiment  1    prop   0.1463   0.3939   0.2568
                 n      41       33       74

            2    prop   0.4545   0.2619   0.3281
                 n      22       42       64

            All  prop   0.2540   0.3200   0.2899
                 n      63       75       138
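The proportions and cell sizes in the table imply whole-number counts, so the participant-level data can be rebuilt for anyone who wants to reproduce the analysis. The sketch below assumes the counts are 6 of 41, 13 of 33, 10 of 22 and 11 of 42 (inferred from the rounded proportions, not stated in the post); the variable names follow the post.

```r
# Cell sizes from the table, and the number of 1s in each cell.
# The counts of 1s are an assumption inferred from the rounded
# proportions (e.g. 0.1463 * 41 is taken to be 6 participants).
n    <- c(41, 33, 22, 42)   # order: Exp1/A, Exp1/V, Exp2/A, Exp2/V
ones <- c(6, 13, 10, 11)

# One row per participant, as glm() expects for a 0/1 response.
d <- data.frame(
  Experiment    = factor(rep(c("1", "1", "2", "2"), times = n)),
  ChoiceVA      = factor(rep(c("A", "V", "A", "V"), times = n)),
  AestheticOnly = unlist(Map(function(k, m) rep(1:0, c(k, m - k)), ones, n))
)

# Cell proportions, for comparison with the table.
prop <- with(d, tapply(AestheticOnly, list(Experiment, ChoiceVA), mean))
print(round(prop, 4))
```

If the inferred counts are right, the printed proportions should agree with the table to four decimal places.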

Just from looking at the data, it is fairly clear that neither main effect is significant (e.g. for ChoiceVA, bottom row, a proportion of .25 of 63 participants is not significantly different from .32 of 75 participants). In my naivety, I thought I could test these hypotheses with a straightforward binary logistic regression:

> mod <- glm( AestheticOnly ~ Experiment+ChoiceVA+Experiment*ChoiceVA, data = d, family=binomial )
> summary(mod)

Call:
glm(formula = AestheticOnly ~ Experiment + ChoiceVA + Experiment *
    ChoiceVA, family = binomial, data = d)

Deviance Residuals:
    Min       1Q   Median       3Q      Max 
-1.1010  -0.7793  -0.5625   1.2557   1.9605 

Coefficients:
                      Estimate Std. Error z value Pr(>|z|)   
(Intercept)            -1.7636     0.4419  -3.991 6.57e-05 ***
Experiment2             1.5813     0.6153   2.570  0.01017 * 
ChoiceVAV               1.3328     0.5676   2.348  0.01887 * 
Experiment2:ChoiceVAV  -2.1866     0.7929  -2.758  0.00582 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 166.16  on 137  degrees of freedom
Residual deviance: 157.01  on 134  degrees of freedom
AIC: 165.01

Number of Fisher Scoring iterations: 4

Clearly, the main effects are not being tested here in the way I hoped. I believe that this model, in testing main effects, rather than testing e.g. ChoiceVA=A against ChoiceVA=V across both levels of Experiment, is confining itself to that comparison only when Experiment=1. Can a model be constructed that instead tests the main effects in the way I would like?
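One common way to get main-effect coefficients that average over the other factor, rather than being evaluated at its reference level, is sum-to-zero (deviation) coding. The sketch below is an illustration under assumptions, not a definitive answer: the data are rebuilt from cell counts inferred from the table, and `mod_sum` is a hypothetical name. The averaging is unweighted and on the logit scale, so these tests need not agree with a comparison of the collapsed margins.

```r
# Rebuild the data from cell counts inferred from the table (an assumption).
n    <- c(41, 33, 22, 42)   # order: Exp1/A, Exp1/V, Exp2/A, Exp2/V
ones <- c(6, 13, 10, 11)
d <- data.frame(
  Experiment    = factor(rep(c("1", "1", "2", "2"), times = n)),
  ChoiceVA      = factor(rep(c("A", "V", "A", "V"), times = n)),
  AestheticOnly = unlist(Map(function(k, m) rep(1:0, c(k, m - k)), ones, n))
)

# Same model as in the question, but with sum-to-zero contrasts.
# Each factor level is coded +1 / -1 instead of 0 / 1.
mod_sum <- glm(
  AestheticOnly ~ Experiment * ChoiceVA,
  data      = d,
  family    = binomial,
  contrasts = list(Experiment = "contr.sum", ChoiceVA = "contr.sum")
)

# "Experiment1" is now half the difference between the two experiments'
# logits, averaged over ChoiceVA; likewise "ChoiceVA1" for ChoiceVA.
print(summary(mod_sum)$coefficients)
```

The fitted values and residual deviance are the same as under the default treatment coding; only the meaning of the individual coefficients, and hence of their Wald tests, changes.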

This is related to a previous question (Logistic regression gives very different result to Fisher's exact test – why?), but when I asked it I understood the problem even less than I do now, and consequently the question was so unclear that I need to start again.

> library(car)
> Anova(glm( AestheticOnly ~ Experiment*ChoiceVA, data = d, family=binomial ), type=2)
Analysis of Deviance Table (Type II tests)

Response: AestheticOnly
                    LR Chisq Df Pr(>Chisq)   
Experiment            0.5768  1   0.447582   
ChoiceVA              0.4583  1   0.498433   
Experiment:ChoiceVA   7.8432  1   0.005101 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
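For a two-factor design like this one, Type II likelihood-ratio tests can be read as comparisons of nested models: each main effect is tested by adding it to a model that already contains the other main effect, and the interaction is tested by adding it to the additive model. The base-R sketch below illustrates that reading without the car package; the data are rebuilt from cell counts inferred from the table (an assumption), and the model names `m_choice`, `m_exp`, `m_add` and `m_full` are hypothetical.

```r
# Rebuild the data from cell counts inferred from the table (an assumption).
n    <- c(41, 33, 22, 42)   # order: Exp1/A, Exp1/V, Exp2/A, Exp2/V
ones <- c(6, 13, 10, 11)
d <- data.frame(
  Experiment    = factor(rep(c("1", "1", "2", "2"), times = n)),
  ChoiceVA      = factor(rep(c("A", "V", "A", "V"), times = n)),
  AestheticOnly = unlist(Map(function(k, m) rep(1:0, c(k, m - k)), ones, n))
)

# The four nested models involved in the comparisons.
m_choice <- glm(AestheticOnly ~ ChoiceVA,              data = d, family = binomial)
m_exp    <- glm(AestheticOnly ~ Experiment,            data = d, family = binomial)
m_add    <- glm(AestheticOnly ~ Experiment + ChoiceVA, data = d, family = binomial)
m_full   <- glm(AestheticOnly ~ Experiment * ChoiceVA, data = d, family = binomial)

# Likelihood-ratio statistic = drop in deviance when the term is added.
lr <- c(
  "Experiment"          = deviance(m_choice) - deviance(m_add),
  "ChoiceVA"            = deviance(m_exp)    - deviance(m_add),
  "Experiment:ChoiceVA" = deviance(m_add)    - deviance(m_full)
)

# Each term has 1 degree of freedom in this 2 x 2 design.
p <- pchisq(lr, df = 1, lower.tail = FALSE)
print(round(cbind(LR = lr, Df = 1, p = p), 4))
```

Because the main effects are tested in the model without the interaction, they describe the effect of each factor adjusted for the other, which is closer to the marginal comparisons the question has in mind than the treatment-coded coefficients are.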

Best Answer

I have found a method which "works" in the sense that it produces p-values which are clearly appropriate. Because of the simplicity of my data set, I know the main effects should not be significant (Fisher's exact test tells me that), but it looks very much like the interaction should be. This is the only method I have found that gives output in line with this. However, I am not sure whether this method is valid in other respects, and I would very much appreciate comments on that. Here it is:

Cheers,

Ben
