Mixed Model – How to Analyze a Dataset with 2 Between- and 2 Within-Factors Using lme4 in Repeated Measures

lme4-nlme, mixed model, repeated measures

This post is a follow-up to my previous post (which was about lme) and uses the same dataset. Now I would like to know how to analyze it using lme4.

The data

The data come from a behavioral experiment in which participants in 6 groups (formed by crossing two between-subject factors) each worked on 16 trials (formed by crossing two 4-level within-subject factors). That is, the dataset d has two between-subject factors, group and condition, and two within-subject (i.e., repeated-measures) factors, topic and problem. The participant id is code and the DV is mean. I uploaded the data to pastebin, so everybody should be able to obtain it:

> d <- read.table(url("http://pastebin.com/raw.php?i=4hRFyaRj"), colClasses = c(rep("factor", 6), "numeric"))
> str(d)
'data.frame':   2928 obs. of  6 variables:
  $ code     : Factor w/ 183 levels "A03U","A08C",..: 1 1 1 1 1 1 1 1 1 1 ...
  $ group    : Factor w/ 2 levels "control","experimental": 2 2 2 2 2 2 2 2 2 2 ...
  $ condition: Factor w/ 3 levels "alternatives",..: 3 3 3 3 3 3 3 3 3 3 ...
  $ topic    : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 2 2 2 2 3 3 ...
  $ problem  : Factor w/ 4 levels "AC","DA","MP",..: 3 4 1 2 3 4 1 2 3 4 ...
  $ mean     : num  94.5 94.5 86.5 84.5 80 46.5 73.5 43.5 51 39 ...

The usual way to analyze these data would be to fit an ANOVA (note how the error term is constructed for the within-subject factors):

m1 <- aov(mean ~ (condition*group*problem*topic) + Error(code/(problem*topic)), d)
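To see which error strata the Error() term asks for, the nesting operator can be expanded with base R's terms(). This is only an illustrative sketch: it works on a one-sided formula built from the same variable names and needs no data.

```r
# Expand code/(problem*topic) into its individual terms.
# terms() operates on the formula alone, so no data frame is required.
error_terms <- attr(terms(~ code / (problem * topic)), "term.labels")
print(error_terms)
```

Each label should correspond to one error stratum: subjects, subjects by problem, subjects by topic, and subjects by problem by topic.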

The Question

My main interest in the data is the following:

Is there an effect of the group factor on any level (i.e., main effect or interaction)? I hope there is not.
Is there an interaction of condition with problem? Or even an interaction of condition with problem and topic?

I have two questions regarding the analysis in lme4:

How can I specify these questions using lme4?
As lme4 does not provide p-values, how do I determine whether a variable (e.g., group) has any effect (I imagine using some kind of likelihood ratio test), and what is the critical value above which I can consider an effect 'significant'?

As is probably obvious from the above description, I am neither an expert in lme4 nor a statistician, so both Venables & Ripley and the lme4 book by Bates gave me a hard time, leaving me as clueless as before.

Best Answer

I believe that lmer won't be able to duplicate what comes out of aov, because it cannot restrict the variance-covariance matrix of the random components to compound symmetry, as aov does. However, you can still try something like

require(lme4)
# assuming a simple symmetric positive-definite structure of variance-covariance matrix 
anova(m2 <- lmer(mean ~ condition*group*problem*topic + (0+problem | code) + (0+topic | code), data = d))

or a simpler model

anova(m3 <- lmer(mean ~ condition*group*problem*topic + (1|code), data = d))

Then you can compare the two models:

anova(m2, m3)

Models:
m3: mean ~ condition * group * problem * topic + (1 | code)
m2: mean ~ condition * group * problem * topic + (0 + problem | code) + 
m2:     (0 + topic | code)
    Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)    
m3  98 24985 25572 -12395                            
m2 117 24899 25599 -12332 124.4     19  < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The above result indicates that the more complicated model (m2, the first one) fits much better. In terms of model complexity, m1 (the aov model in your question) falls between m2 and m3.
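The Df column in that comparison can be checked by hand. The sketch below assumes that each (0 + factor | code) term estimates a full unstructured covariance matrix over the factor's levels and that each model has one residual variance; the helper name n_cov_params is made up for illustration.

```r
# Number of fixed-effect coefficients: one per cell of the
# condition (3) x group (2) x problem (4) x topic (4) design.
n_fixed <- 3 * 2 * 4 * 4

# An unstructured k x k covariance matrix has k variances and
# k * (k - 1) / 2 covariances, i.e. k * (k + 1) / 2 parameters.
n_cov_params <- function(k) k * (k + 1) / 2

# m3: random-intercept variance + residual variance
df_m3 <- n_fixed + n_cov_params(1) + 1
# m2: one 4 x 4 matrix for problem, one for topic, + residual variance
df_m2 <- n_fixed + n_cov_params(4) + n_cov_params(4) + 1

# Degrees of freedom of the likelihood ratio test
df_diff <- df_m2 - df_m3

# Upper-tail chi-square probability for the reported statistic (124.4)
p_value <- pchisq(124.4, df = df_diff, lower.tail = FALSE)

print(c(m3 = df_m3, m2 = df_m2, diff = df_diff))
print(p_value)
```

The counts agree with the Df and Chi Df columns of the table, and the tail probability falls below the 2.2e-16 floor that R prints.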

To obtain p-values and confidence intervals for specific effects, do this:

require(languageR)
set.seed(101)
# It will take a long time to run the MCMC simulations due to the huge number of effects in the model
mcmc2 <- pvals.fnc(m2, nsim=10000, withMCMC=TRUE)
mcmc2$fixed

The last line shows the confidence intervals and two p-values (one from the MCMC simulations and one from the fitted t-statistic) for all fixed effects specified in the model. I am not copying the result here because it is a long table.
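The likelihood ratio test that the question mentions can be sketched as follows. As far as I know, pvals.fnc depends on lme4's mcmcsamp, which is not available in lme4 1.0 and later, so a comparison of nested models may be the more practical route. The data frame toy below is simulated purely for illustration (hypothetical values, same column names as d), and a random intercept is used only to keep the example fast; substitute d and the random-effects structure you actually want. The chi-square p-value is an approximation and should be treated with some caution.

```r
library(lme4)

set.seed(1)

# Hypothetical toy data with the same column names as `d` (not the real data):
# 36 subjects in a 2 (group) x 3 (condition) between-subject design.
subjects <- data.frame(
  code      = factor(sprintf("S%02d", 1:36)),
  group     = factor(rep(c("control", "experimental"), each = 18)),
  condition = factor(rep(c("a", "b", "c"), times = 12))
)
# 4 (topic) x 4 (problem) within-subject design.
trials <- expand.grid(
  topic   = factor(1:4),
  problem = factor(c("AC", "DA", "MP", "MT"))
)
# by = NULL yields the Cartesian product: every subject sees every trial.
toy <- merge(subjects, trials, by = NULL)

# Simulated response: subject effect + a problem effect + noise, no group effect.
subj_eff <- rnorm(36, sd = 10)
toy$mean <- 60 + subj_eff[as.integer(toy$code)] +
  5 * (toy$problem == "MP") + rnorm(nrow(toy), sd = 8)

# Fit with ML (REML = FALSE) so that models with different fixed effects
# are comparable.
full     <- lmer(mean ~ condition * group * problem * topic + (1 | code),
                 data = toy, REML = FALSE)
# Same model with every term involving group removed.
no_group <- lmer(mean ~ condition * problem * topic + (1 | code),
                 data = toy, REML = FALSE)

# Likelihood ratio test: does group matter at any level?
lrt <- anova(no_group, full)
print(lrt)
```

The test has as many degrees of freedom as there are fixed-effect coefficients involving group (48 in this design), and the corresponding 5% critical value is qchisq(0.95, df = 48). The same pattern, dropping only the terms of interest, applies to the condition by problem interactions.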

If you want to know whether those groups have different variance-covariance structures, you may try

anova(m20 <- lmer(mean ~ condition*group*problem*topic + (0+problem|code/group) + (0+topic|code/group), data = d))

or,

anova(m21 <- lmer(mean ~ condition*group*problem*topic + (0+problem|code/condition) + (0+topic|code/condition), data = d))

And then compare the above two models with m2:

anova(m2, m20, m21)

Related Solutions

R Mixed Model – Specifying an LME Model with More Than One Within-Subjects Factor

I found an answer to my question in this thread: Repeated measures ANOVA with lme in R for two within-subject factors (somehow this thread was already among my favorites; I must have forgotten about it). The specification is a little unwieldy.

m6 <- lme(mean ~ condition*group*problem*topic, 
   random = list(code=pdBlocked(list(~1, pdIdent(~problem-1), pdIdent(~topic-1)))), data = d)
anova(m6)

However, the denominator dfs are still wrong, as noted in the thread and apparent in comparisons between the ANOVA and lme dfs.

data.frame(effect = rownames(anova(m6)), denDf= anova(m6)$denDF)

m4$ANOVA[,c("Effect", "DFd")]

As long as there are no other ideas, I think I will need to do the analysis in lme4, for which I will need to post another question.

Solved – How to assign degrees of freedom for two-way ANOVA with two within-subjects factors

I'm not sure I understand the question exactly, but if you are asking about the df for the two-way, factorial, within-subjects ANOVA, here they are:

df_A = a - 1, where a = number of levels of A
df_B = b - 1, where b = number of levels of B
df_{A x B} = (a - 1)(b - 1)
df_S = n - 1, where n = number of levels of S (i.e., number of subjects)
df_{A x S} = (a - 1)(n - 1)
df_{B x S} = (b - 1)(n - 1)
df_{A x B x S} = (a - 1)(b - 1)(n - 1)

E.g.:

A = cond (a = 3); B = rnd (b = 6); S (n = 44)
- df_A = 2
- df_B = 5
- df_{A x B} = 10
- df_S = 43
- df_{A x S} = 86
- df_{B x S} = 215
- df_{A x B x S} = 430
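The formulas above can be wrapped in a small R helper to reproduce the worked example. The function name within_anova_df is made up for illustration.

```r
# Degrees of freedom for a two-way within-subjects ANOVA with
# a levels of A, b levels of B, and n subjects.
within_anova_df <- function(a, b, n) {
  c(
    A     = a - 1,
    B     = b - 1,
    AxB   = (a - 1) * (b - 1),
    S     = n - 1,
    AxS   = (a - 1) * (n - 1),
    BxS   = (b - 1) * (n - 1),
    AxBxS = (a - 1) * (b - 1) * (n - 1)
  )
}

dfs <- within_anova_df(a = 3, b = 6, n = 44)
print(dfs)

# Sanity check: the components add up to the total df, a * b * n - 1.
stopifnot(sum(dfs) == 3 * 6 * 44 - 1)
```

The final check confirms that the seven components partition the total degrees of freedom of the 3 x 6 x 44 observations.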