Solved – How to account for repeated measures in glmer

glmm, lme4-nlme, r, repeated measures

My design is as follows.

$y$ is a Bernoulli response
$x_1$ is a continuous variable
$x_2$ is a categorical (factor) variable with two levels

The experiment is completely within subjects. That is, each subject receives each combination of $x_1$ and $x_2$.

This is a repeated measures logistic regression set-up. The experiment will give two ogives for $p(y=1)$ vs $x_1$, one for level1 and one for level2 of $x_2$. The effect of $x_2$ should be that for level2 compared to level1, the ogive should have a shallower slope and increased intercept.

I am struggling with finding the model using lme4. For example,

glmer(y ~ x1*x2 + (1|subject), family=binomial)

So far as I understand it, the 1|subject part says that subject is a random effect. But I do not see how to specify that $x_1$ and $x_2$ are repeated measures variables. In the end, I want a model that includes a random effect for subjects, and gives estimated slopes and intercepts for level1 and level2.

Best Answer

tl;dr: Your model already accounts for the fact that you have repeated measures. Nonetheless, if it fits, you would do best to use:

glmer(y ~ x1*x2 + (x1*x2|subject), family=binomial)

but if that isn't tractable, you could try:

glmer(y ~ x1*x2 + (1|subject) + (0+x1|subject) + (0+x2|subject), family=binomial)

(For an explanation of the syntax here, see: R's lmer cheat-sheet.)
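As a rough way to see how much simpler the second formula is, the sketch below counts the random-effect covariance parameters each formula asks for, without fitting anything. It uses `glFormula` from lme4 to parse the formulas; the data frame `d`, its sizes, and the helper `count_theta` are made up for illustration and are not from the question.

```r
# Sketch only: placeholder data, used just so the formulas can be parsed.
library(lme4)

set.seed(2)
d <- expand.grid(subject = factor(1:20),
                 x1 = seq(-2, 2, length.out = 5),
                 x2 = factor(c("level1", "level2")))
d$y <- rbinom(nrow(d), size = 1, prob = 0.5)  # arbitrary 0/1 response

# Hypothetical helper: number of covariance parameters (theta) implied by a formula.
count_theta <- function(f) {
  length(glFormula(f, data = d, family = binomial)$reTrms$theta)
}

n_max   <- count_theta(y ~ x1 * x2 + (x1 * x2 | subject))
n_split <- count_theta(y ~ x1 * x2 + (1 | subject) + (0 + x1 | subject) +
                         (0 + x2 | subject))

c(maximal = n_max, split = n_split)
```

The maximal formula estimates a full covariance matrix among the random intercept, both slopes, and their interaction, which is why it has more parameters and is harder to fit.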

Full version: You don't need to "tell" R that $x_1$ and $x_2$ are repeated measures variables. (This is really just a small semantic distinction, but) I wouldn't say that variables can be "repeated measures variables" vs. "non-repeated measures variables". Variables are just variables. I would say that, e.g., 'variable 1 is measured within patients, and variable 2 is measured between patients' or something like that. Of course, your phrasing is fine, you just don't want it to lead to some confusion where you think of repeated measures-ness as some ontological status intrinsic to the variable.

At any rate, instead of telling R that a variable is measured within people, you simply need to formulate a model using fixed and/or random effects to account for the non-independence of the data that come from the same person. (Yes, you can use a fixed effect to account for this: every person would be a level of a categorical variable that is included. However, this will answer a slightly different question, almost certainly not the one you are interested in, and unless you have many measurements on the same person in every combination of conditions, the model will not be tractable.) In practice, you will use random effects to account for this. Specifically, you will have a random effect for each subject.
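Written out, a random-intercept version of such a model looks roughly as follows. The notation is introduced here for illustration: $i$ indexes subjects, $j$ indexes trials, and $x_2$ is assumed to be dummy-coded (0 for level1, 1 for level2).

$$\operatorname{logit} P(y_{ij}=1) = \beta_0 + \beta_1 x_{1ij} + \beta_2 x_{2ij} + \beta_3 x_{1ij} x_{2ij} + b_i, \qquad b_i \sim N(0, \sigma_b^2)$$

Under that coding the level1 ogive has intercept $\beta_0$ and slope $\beta_1$, and the level2 ogive has intercept $\beta_0 + \beta_2$ and slope $\beta_1 + \beta_3$. The shared term $b_i$ is what ties together the observations from the same subject.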

Next you need to specify what you want random effects for. The syntax you used, (1|subject), will cause R to include a random intercept for each person. This will shift someone's line of best fit up or down relative to the mean. You should think about whether people are also likely to differ in their slopes, i.e., how strongly they respond to changes in your variables. You should also think about whether the random effects are correlated with each other, e.g., maybe people who start off higher when $x_1=0$ tend to also respond more strongly to increases in $x_1$. Common advice is to include all possible random effects and intercorrelations (Barr et al., 2013, "Keep it maximal"). However, bear in mind that GLMMs are more difficult computationally than LMMs, so such a model may not be tractable.
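To make this concrete, here is a minimal sketch on simulated data. The sample sizes, effect sizes, and object names (`d`, `m_int`, `m_slope`, `per_level`) are all assumptions chosen for illustration. It fits a random-intercept model and a model that adds a correlated random slope for `x1`, compares them, and then reads off the intercept and slope for each level of `x2` from the fixed effects, assuming R's default treatment coding.

```r
# Sketch only: simulated data; every number below is made up.
library(lme4)

set.seed(1)
n_subj <- 30
d <- expand.grid(subject = factor(seq_len(n_subj)),
                 x1  = seq(-2, 2, length.out = 9),
                 x2  = factor(c("level1", "level2")),
                 rep = 1:4)

# Subject-specific deviations in intercept and in the x1 slope
b0  <- rnorm(n_subj, sd = 0.8)
b1  <- rnorm(n_subj, sd = 0.4)
sid <- as.integer(d$subject)
is2 <- as.numeric(d$x2 == "level2")

# level2: higher intercept (+0.5) and shallower slope (-0.7) than level1
eta <- (0.5 * is2 + b0[sid]) + (1.5 - 0.7 * is2 + b1[sid]) * d$x1
d$y <- rbinom(nrow(d), size = 1, prob = plogis(eta))

# Random intercept only, then random intercept plus correlated x1 slope
m_int   <- glmer(y ~ x1 * x2 + (1 | subject),      data = d, family = binomial)
m_slope <- glmer(y ~ x1 * x2 + (1 + x1 | subject), data = d, family = binomial)

# Likelihood-ratio comparison of the two random-effects structures
anova(m_int, m_slope)

# Intercept and slope per level of x2 (treatment coding, level1 as reference)
fe <- fixef(m_slope)
per_level <- data.frame(
  x2        = c("level1", "level2"),
  intercept = c(fe[["(Intercept)"]], fe[["(Intercept)"]] + fe[["x2level2"]]),
  slope     = c(fe[["x1"]],          fe[["x1"]] + fe[["x1:x2level2"]])
)
per_level
```

The estimates are on the logit scale. With real data, convergence warnings or singular fits are a signal that the random-effects structure may be more than the data can support.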

Related Solutions

Solved – How to estimate correlation among repeated measures

When you include subject as a random effect in ANOVA you assume that the subject effect and the product effect are additive. Have you thought about negative covariances? Maybe the more one likes red socks the less they like green ones...

Repeated Measures ANOVA – Using LME/LMER in R for Two Within-Subject Factors

What you're fitting with aov is called a strip plot, and it's tricky to fit with lme because the subject:A and subject:B random effects are crossed.

Your first attempt is equivalent to aov(Y ~ A*B + Error(subject), data=d), which doesn't include all the random effects; your second attempt is the right idea, but the syntax for crossed random effects using lme is very tricky.

Using lme from the nlme package, the code would be

lme(Y ~ A*B, random=list(subject=pdBlocked(list(~1, pdIdent(~A-1), pdIdent(~B-1)))), data=d)

Using lmer from the lme4 package, the code would be something like

lmer(Y ~ A*B + (1|subject) + (1|A:subject) + (1|B:subject), data=d)
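For a quick check that the lmer formula runs and to see which variance components it estimates, here is a small sketch on simulated data. The data frame `d` below, with 12 subjects and two three-level factors, and all standard deviations are invented for illustration; this is not the data set from the original question.

```r
# Sketch only: balanced simulated design, one observation per subject x A x B cell.
library(lme4)

set.seed(123)
d <- expand.grid(subject = factor(1:12),
                 A = factor(c("a1", "a2", "a3")),
                 B = factor(c("b1", "b2", "b3")))

# Made-up random effects: subject, subject-by-A, and subject-by-B
s  <- rnorm(12, sd = 1.0)
sa <- rnorm(36, sd = 0.7)
sb <- rnorm(36, sd = 0.7)

d$Y <- as.integer(d$A) + 0.5 * as.integer(d$B) +
  s[as.integer(d$subject)] +
  sa[as.integer(interaction(d$A, d$subject))] +
  sb[as.integer(interaction(d$B, d$subject))] +
  rnorm(nrow(d), sd = 1)

m <- lmer(Y ~ A * B + (1 | subject) + (1 | A:subject) + (1 | B:subject), data = d)

# One variance per grouping term plus the residual
vc <- as.data.frame(VarCorr(m))
vc[, c("grp", "vcov")]
```

Each grouping term contributes a single variance, so the table has four rows: `A:subject`, `B:subject`, `subject`, and `Residual`.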

These threads from R-help may be helpful (and to give credit, that's where I got the nlme code from).

http://www.biostat.wustl.edu/archives/html/s-news/2005-01/msg00091.html

http://permalink.gmane.org/gmane.comp.lang.r.lme4.devel/3328

http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg10843.html

This last link refers to p.165 of Pinheiro/Bates; that may be helpful too.

EDIT: Also note that in the data set you have, some of the variance components are negative, which is not allowed when using random effects with lme, so the results differ. A data set with all positive variance components can be created using a seed of 8. The results then agree. See this answer for details.

Also note that lme from nlme does not compute the denominator degrees of freedom correctly, so the F-statistics agree but not the p-values. lmer from lme4 doesn't try to, because it's very tricky in the presence of unbalanced crossed random effects and may not even be a sensible thing to do. But that's more than I want to get into here.
