Solved – Testing significance of random effects in a linear mixed-effects model with interactions

Tags: interaction, lme4-nlme, mixed-model

I would like to test the significance of all interactions in a 3 factor linear mixed-effects model. Factors A and B are fixed, and factor C is random. Using lmer, the full model is:

lmer(response ~ A*B + (1 + A*B|C), data.frame)

My question concerns getting p-values for the 2-way interactions and the main effect of C. I initially planned to use likelihood ratio tests to compare this full model to reduced models by removing one interaction at a time. But of course specifying the 3-way interaction automatically includes the 2-way interactions and main effect of C.

So I would like to know if there is a way to construct a reduced model that includes the 3-way interaction and omits only a single 2-way interaction (or the main effect of C).

But of course the LR-test has its limitations. I would appreciate any advice on other ways to obtain p-values for random effects in this model. I am aware of RLRsim, MCMC, bootMer. Are any of these preferable/more feasible for my situation?

Best Answer

tl;dr This is possible, but very tedious (especially for factors with more than 2 levels), and you might not want to do it after all. If you can decide exactly what you mean when you test the lower-level interactions, you can probably do it.

This is a little bit tricky partly for technical R-specific reasons, and partly for statistical/inferential reasons.

The statistically tricky part is that testing lower-level interactions in a model that also contains higher-level interactions is (depending on who you talk to) either (i) hard to do correctly or (ii) just plain silly (for the latter position, see part 5 of Bill Venables's "Exegeses on Linear Models"). The rubric for this is the principle of marginality. At the very least, the meaning of the lower-order terms depends sensitively on how contrasts in the model are coded (e.g. treatment vs. midpoint/sum-to-zero contrasts). My default rule is that if you're not sure you understand exactly why this might be a problem, you shouldn't violate the principle of marginality.

The technically tricky part is that R's formula language doesn't have a simple way to drop lower-order terms in the presence of higher-order interactions (in part because this machinery was originally designed by researchers from the Nelder/Venables camp, who don't think this is a sensible thing to do). I have done this in the past, in cases where I knew it was sensible, by constructing the model matrix, dropping the columns I didn't want, and using the remaining columns as explicit variables in the model. In particular, I had terms period (before vs. after) and ttt (treatment: control vs. removal) and wanted to estimate the period:ttt interaction while setting the before-period difference between treatments to zero: for the default ("treatment") contrasts in R, this corresponds to setting $\beta_{\textrm{ttt}}$ to zero, or equivalently removing that column from the model matrix.

dd <- expand.grid(period=c("before","after"),
                  ttt=c("control","removal"))
## some trickery to get factor levels in the sensible order
dd[] <- lapply(dd,function(x) factor(x,levels=unique(x)))

You might think that ~period*ttt-ttt would be the way to specify the desired result, but it doesn't actually work:

colnames(model.matrix(~period*ttt-ttt,dd))
## [1] "(Intercept)"             "periodafter"            
## [3] "periodbefore:tttremoval" "periodafter:tttremoval" 

(neither does ~period+period:ttt, or any other sensible combination I could think of). Instead:

X <- model.matrix(~period*ttt,dd)
## also remove the intercept because this will get re-added
X <- X[,!colnames(X) %in% c("(Intercept)","tttremoval")]
colnames(X) <- c("period","period_ttt")  ## friendlier names

Now we can fit a model with y ~ period + period_ttt, data = data.frame(y, X) (lm() needs a data frame, not a bare matrix) and get what we want.
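Putting the pieces together, here is a minimal runnable sketch. The response y is simulated noise and lm() stands in for the mixed model; only the model-matrix manipulation is the point:

```r
## hypothetical illustration: 'y' is pure noise and lm() stands in for
## the mixed model -- the model-matrix surgery is what matters
set.seed(101)
dd <- expand.grid(period = c("before", "after"),
                  ttt    = c("control", "removal"))
dd[] <- lapply(dd, function(x) factor(x, levels = unique(x)))
df <- dd[rep(1:4, each = 10), ]          ## 10 replicates per design cell
X  <- model.matrix(~ period * ttt, df)
X  <- X[, !colnames(X) %in% c("(Intercept)", "tttremoval")]
colnames(X) <- c("period", "period_ttt") ## friendlier names
df2 <- data.frame(y = rnorm(nrow(df)), X)
fit <- lm(y ~ period + period_ttt, data = df2)
names(coef(fit))
## [1] "(Intercept)" "period"      "period_ttt"
```

Note that this fit estimates the after-period effect and the after-period treatment difference, while the before-period treatment difference is constrained to zero, as intended.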

How does this translate to your case?

the full model is:

lmer(response ~ A*B + (1 + A*B|C), data.frame)

My question concerns getting p-values for the 2-way interactions and the main effect of C.

If you want a p-value for the interaction of A:B you should probably compare the model above to

update(model, . ~ A + B + (1 + A + B | C))
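The comparison itself is a likelihood-ratio test via anova(). Here is a hedged sketch of the mechanics using nested lm() fits so the example runs without lme4 (the data are simulated); with the real lmer fits, anova(full, reduced) performs the analogous chi-square test, with the usual caveat that p-values for variance components tested on their boundary are conservative:

```r
## sketch of a nested-model comparison; in the real analysis 'full' and
## 'reduced' would be the two lmer fits and anova() would report an LRT
set.seed(42)
d <- expand.grid(A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2")),
                 rep = 1:10)
d$y <- rnorm(nrow(d))                 ## invented response
full    <- lm(y ~ A * B, data = d)
reduced <- lm(y ~ A + B, data = d)    ## drops the A:B interaction
anova(reduced, full)                  ## F test here; LRT for lmer fits
```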

I'm not sure what the "main effect of C" is -- is that the variation of the intercept among levels of C? If so, you probably have to use the trick above to exclude the intercept from A*B, e.g.

update(model, . ~ A*B + (0 + A2 + B2 + A2_B2 | C))

where A2, B2, and A2_B2 are dummy variables as above.
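For two-level factors those dummies can be built from the model matrix just as before (the factor levels here are invented for illustration; factors with more levels produce several columns per term, which is where the tedium comes in):

```r
## hypothetical 2-level factors A and B; build explicit dummy columns
d  <- expand.grid(A = factor(c("a1", "a2")), B = factor(c("b1", "b2")))
mm <- model.matrix(~ A * B, d)
mm <- mm[, colnames(mm) != "(Intercept)"]  ## drop the intercept column
colnames(mm) <- c("A2", "B2", "A2_B2")     ## friendlier names, as above
d  <- cbind(d, mm)
names(d)
## [1] "A"     "B"     "A2"    "B2"    "A2_B2"
```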

But it's up to you to make sure this makes sense.
