Solved – Testing significance of random effects in a linear mixed-effects model with interactions

Tags: interaction, lme4-nlme, mixed-model

I would like to test the significance of all interactions in a 3 factor linear mixed-effects model. Factors A and B are fixed, and factor C is random. Using lmer, the full model is:

lmer(response ~ A*B + (1 + A*B|C), data.frame)

My question concerns getting p-values for the 2-way interactions and the main effect of C. I initially planned to use likelihood ratio tests to compare this full model to reduced models by removing one interaction at a time. But of course specifying the 3-way interaction automatically includes the 2-way interactions and main effect of C.

So I would like to know if there is a way to construct a reduced model that includes the 3-way interaction and omits only a single 2-way interaction (or the main effect of C).

But of course the LR-test has its limitations. I would appreciate any advice on other ways to obtain p-values for random effects in this model. I am aware of RLRsim, MCMC, bootMer. Are any of these preferable/more feasible for my situation?

Best Answer

tl;dr This is possible, but very tedious (especially for factors with more than 2 levels), and you might not want to do it after all. If you can decide exactly what you mean when you test the lower-level interactions, you can probably do it.

This is a little bit tricky partly for technical R-specific reasons, and partly for statistical/inferential reasons.

The statistically tricky part is that testing lower-level interactions in a model that also contains higher-level interactions is (depending on who you talk to) either (i) hard to do correctly or (ii) just plain silly (for the latter position, see part 5 of Bill Venables's "Exegeses on Linear Models"). The rubric for this is the principle of marginality. At the very least, the meaning of the lower-order terms depends sensitively on how contrasts in the model are coded (e.g. treatment vs. midpoint/sum-to-zero contrasts). My default rule is that if you're not sure you understand exactly why this might be a problem, you shouldn't violate the principle of marginality.

The technically tricky part is that R's formula language doesn't have a simple way to drop lower-order terms in the presence of higher-order interactions (in part because this machinery was originally designed by researchers from the Nelder/Venables camp, who don't think this is a sensible thing to do). I have done this in the past, in cases where I knew it was sensible, by constructing the model matrix, dropping the columns I didn't want, and using the remaining columns as explicit variables in the model. In particular, I had terms period (before vs. after) and ttt (treatment: control vs. removal) and wanted to estimate the period:ttt interaction while setting the before-period difference between treatments to zero: for the default ("treatment") contrasts in R, this corresponds to setting $\beta_{\textrm{ttt}}$ to zero, or equivalently removing that column from the model matrix.

dd <- expand.grid(period=c("before","after"),
                  ttt=c("control","removal"))
## some trickery to get factor levels in the sensible order
dd[] <- lapply(dd,function(x) factor(x,levels=unique(x)))

You might think that ~period*ttt-ttt would be the way to specify the desired result, but it doesn't actually work:

colnames(model.matrix(~period*ttt-ttt,dd))
## [1] "(Intercept)"             "periodafter"            
## [3] "periodbefore:tttremoval" "periodafter:tttremoval" 

(neither does ~period+period:ttt, or any other sensible combination I could think of). Instead:

X <- model.matrix(~period*ttt,dd)
## also remove the intercept because this will get re-added
X <- X[,!colnames(X) %in% c("(Intercept)","tttremoval")]
colnames(X) <- c("period","period_ttt")  ## friendlier names

Now we can fit a model with y ~ period + period_ttt, data = data.frame(y, X) (lm() needs a data frame, not a bare matrix) and get what we want.
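Putting the pieces together, here is a minimal runnable sketch. The response y is simulated noise and lm() stands in for the mixed model; only the model-matrix manipulation is the point:

```r
## hypothetical illustration: 'y' is pure noise and lm() stands in for
## the mixed model -- the model-matrix surgery is what matters
set.seed(101)
dd <- expand.grid(period = c("before", "after"),
                  ttt    = c("control", "removal"))
dd[] <- lapply(dd, function(x) factor(x, levels = unique(x)))
df <- dd[rep(1:4, each = 10), ]          ## 10 replicates per design cell
X  <- model.matrix(~ period * ttt, df)
X  <- X[, !colnames(X) %in% c("(Intercept)", "tttremoval")]
colnames(X) <- c("period", "period_ttt") ## friendlier names
df2 <- data.frame(y = rnorm(nrow(df)), X)
fit <- lm(y ~ period + period_ttt, data = df2)
names(coef(fit))
## [1] "(Intercept)" "period"      "period_ttt"
```

Note that this fit estimates the after-period effect and the after-period treatment difference, while the before-period treatment difference is constrained to zero, as intended.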

How does this translate to your case?

the full model is:

lmer(response ~ A*B + (1 + A*B|C), data.frame)

My question concerns getting p-values for the 2-way interactions and the main effect of C.

If you want a p-value for the interaction of A:B you should probably compare the model above to

update(model, . ~ A + B + (1 + A + B | C))
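The comparison itself is a likelihood-ratio test via anova(). Here is a hedged sketch of the mechanics using nested lm() fits so the example runs without lme4 (the data are simulated); with the real lmer fits, anova(full, reduced) performs the analogous chi-square test, with the usual caveat that p-values for variance components tested on their boundary are conservative:

```r
## sketch of a nested-model comparison; in the real analysis 'full' and
## 'reduced' would be the two lmer fits and anova() would report an LRT
set.seed(42)
d <- expand.grid(A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2")),
                 rep = 1:10)
d$y <- rnorm(nrow(d))                 ## invented response
full    <- lm(y ~ A * B, data = d)
reduced <- lm(y ~ A + B, data = d)    ## drops the A:B interaction
anova(reduced, full)                  ## F test here; LRT for lmer fits
```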

I'm not sure what the "main effect of C" is -- is that the variation of the intercept among levels of C? If so, you probably have to use the trick above to exclude the intercept from A*B, e.g.

update(model, . ~ A*B + (0 + A2 + B2 + A2_B2 | C))

where A2, B2, and A2_B2 are dummy variables as above.
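For two-level factors those dummies can be built from the model matrix just as before (the factor levels here are invented for illustration; factors with more levels produce several columns per term, which is where the tedium comes in):

```r
## hypothetical 2-level factors A and B; build explicit dummy columns
d  <- expand.grid(A = factor(c("a1", "a2")), B = factor(c("b1", "b2")))
mm <- model.matrix(~ A * B, d)
mm <- mm[, colnames(mm) != "(Intercept)"]  ## drop the intercept column
colnames(mm) <- c("A2", "B2", "A2_B2")     ## friendlier names, as above
d  <- cbind(d, mm)
names(d)
## [1] "A"     "B"     "A2"    "B2"    "A2_B2"
```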

But it's up to you to make sure this makes sense.
