Linear Mixed Models – Insights into Two-Way Repeated Measures Analysis

fixed-effects-modellme4-nlmemixed modelrrandom-effects-model

This is my first endeavor into linear mixed models, and I haven't found an example that uses a fully repeated measures design, so I was hoping that I could get some help.

I have a dataset that looks like this:

Mydata <- data.frame(
Subject = c(1,2,3,4,5),
Condition =c(rep(c("High", "Low"), each=50)),
Size =c(rep(c(8, 20, 50, 70, 100), each=10)),
Estimate =c(10, 12, 15, 18, 8, 12, 12, 10, 14, 8, 25, 36, 29, 45, 38, 28, 36, 29, 40, 36, 68, 75, 65, 78, 60, 69, 74, 63, 80, 62, 85, 99, 84, 100, 90, 82, 99, 88, 102, 85, 140, 150, 190, 180, 200, 130, 160, 190, 190, 210, 8, 6, 9, 8, 10, 7, 7, 8, 9, 12, 20, 21, 25, 30, 26, 22, 23, 22, 30, 25, 45, 40, 50, 60, 55, 40, 45, 55, 57, 58, 70, 80, 60, 80, 75, 75, 78, 65, 60, 70, 100, 115, 120, 125, 110, 110, 105, 120, 120, 110)
)
Mydata <- Mydata[order(Mydata$Subject),]

In the data, I have 2 within-subjects factor:

Set size (8, 20, 50, 70, 100); 2 repetitions per level (Note: this is a continuous/ordinal numeric variable)
Condition (High vs. low)

Each subject was presented with a dot array of different set sizes (8, 20, 50, 70, 100), with 2 repetitions per set size, and they had to estimate how many dots they saw. Each subject did the same task twice with different levels of calibration (high vs. low). Hence, each row represents a single trial based on condition and set size.

My goal is to examine if there are main effects of Condition and Set size, and if there is a Condition x Set Size interaction.

From what I have read, a linear mixed model is well-suited especially if I want to model Subjects as a random factor (and Condition and Set Size as fixed effects), and to minimize data aggregation across levels of a factor.

My first set of questions are: Can I use the trial-level data as they are now? My actual dataset has 15 levels of set size and 8 repetitions per set size. Or would it be more appropriate to aggregate across repetitions to obtain a mean value per set size before I fit the mixed model?

After reading up on how to use lme4 and nlme, I tried to fit the following random-intercepts-only models using lme4 (I also fitted them with nlme):

baseline.model <- lmer(Estimate ~ 1 + (1|Subject), Mydata)

To test for main effect of set size

setsize.model <- lmer(Estimate ~ Size + (1|Subject), Mydata)
anova(baseline.model, setsize.model)

To test for main effect of condition

condition.model <- lmer(Estimate ~ Size + Condition + (1|Subject), Mydata)
anova(setsize.model, condition.model)

To test for interaction

interaction.model <- lmer(Estimate ~ Size*Condition + (1|Subject), Mydata)
anova(condition.model, interaction.model)

My second set of questions are: Are these models appropriately fitted? Should I be considering nested models, e.g., Size and Condition within Subject? Should I also consider random slopes as it seems reasonable to suppose that the effects of Size and/or Condition, or even Size x Condition may vary across Subjects. I thought this may be critical if I trial-level data tend to be correlated within subjects. If so, is this the correct way to fit it?

interaction.model <- lmer(Estimate ~ Size*Condition + (1 + Condition + Size + Condition*Size|Subject), Mydata)

My final question is: With an expected Size x Condition interaction, how can I perform post-hoc tests to examine if there is an effect of Condition for each level of Size (I'm hypothesizing Condition effects for larger set sizes, but not for small set sizes)? I've tried the following, but the parameter estimates are identical for the 2 levels of Conditions for each level of Size, so I was wondering if the model was fitted wrongly in the first place:

lsmeans(interaction.model, ~ Condition | Size, adjust="tukey")

I would greatly appreciate any advice! If there is anything that I should clarify, please let me know, and I will provide as much details as I can.

Thanks!

Best Answer

A linear mixed model is what you want. First, make sure that Subject is a factor:

Mydata$Subject <- as.factor(Mydata$Subject)

Then, I would fit the model with saturated fixed- and random-effects structures:

mod1 <- lmer(Estimate ~ Condition + Size +  Condition * Size + # Fixed effects
               (1 + Condition + Size | Subject), # Random effects, nested within subject
             data=Mydata, REML=TRUE) # Specifying data and estimation

I know part of the formula is redundant, but I wanted to make it clear as possible to anyone reading in the future.

Note that the model with the fixed effects structure of (1 + Condition + Size + Condition*Size | Subject) is not identified and will not converge.

You want Condition specified as a random effect because this allows the variance at different levels of condition to be different, and Size lets the slope of Size be different across people.

Then I would do a top-down testing procedure. I would then set up identical models, but take out the random slopes one-by-one:

mod2 <- lmer(Estimate ~ Condition + Size +  Condition * Size + # Fixed effects
               (1 + Size | Subject), # Random effects, nested within subject
             data=Mydata, REML=TRUE) # Specifying data and estimation

mod3 <- lmer(Estimate ~ Condition + Size +  Condition * Size + # Fixed effects
               (1 + Condition | Subject), # Random effects, nested within subject
             data=Mydata, REML=TRUE) # Specifying data and estimation

anova(mod1,mod2,refit=FALSE) # tests for significance of condition random effecct
anova(mod1,mod3,refit=FALSE) # tests for significance of size random effecct

Note that you are estimating a random slope and covariance between slope and intercept at the same time with those two anova tests, so you will want to adjust your p-values accordingly.

If removing the slopes yields a significant difference, leave them in; if the fit is just as good when you take them out, then you can leave them out.

Then you can do the same with fixed effects, but you can specify REML=FALSE for those and there is no need to adjust the p-value. You can also use the t-tests from lmerTest; they generally give the same results.

It is unclear as to if Size is continuous or not. You say that it is, but then you talk about there being different "levels" of it and wanting to do post-hoc tests to compare it at different levels. If you are treating it as linear and continuous, then none of that matters; if you want to do a post-hoc test for a continuous moderator, you could look at simple slope analyses.

Related Solutions

Factorial Experiment – How to Deal With Repeated Measurements in the Same Condition of a Factorial Experiment

Multi-level models (aka mixed models etc) are designed to deal with the case where you have multiple measures on one person.

In the typical 2x2 design (not repeated measures) you have multiple observations in each cell, but these are on different and unrelated subjects (people or whatever), thus, they are independent, and ANOVA or regression (both are the general linear model) are fine (provided other assumptions are met).

If you have repeated measures on each subject, those data are not independent. There are various ways to deal with this. One way is to average the data for each person, but this isn't a very good way. Much better methods are multi-level models or general estimating equations (GEE).

Unfortunately, the terminology here can get very confusing. Better to write equations.

The general linear model (regular ANOVA or regression):

$Y = X\beta + \epsilon $

where Y is a vector of the dependent variable, X a matrix of independent variables, $\beta$ a vector of parameters to be estimated and $\epsilon$ is error. This assumes that

$\epsilon \sim \text{iid } \mathcal{N}(0, \sigma) $

Multi-level:

$Y = X\beta + Z\gamma + \epsilon$

where Z is the (known) design matrix and $\gamma$ is a vector of random effect parameters. Assumes $\gamma \sim \mathcal{N} (0, \sigma) $ and that the covariance between $\gamma$ and $\epsilon$ is 0.

Solved – Repeated-measures linear mixed effect model

okay should work out okay then-- so

Yes or you can use the lmer() and lme4. There is another one but I don't remember off the top of my head I think it is just lme?

2.You have a nested structure so yes you need (1|sample/participant)

Did you plot rating score vs stim.level to see evidence of a quadratic relationship? If not try plotting to see-- i you do see a quadratic pattern then yes you should add stim.level quadratic effect by

model <- lmer (rating.score ~ stim.level + I(stim.level^2) factor + stim.level*factor +(1|sample/participant) , mydata)

To reply to the comment

so if you are fitting a parabola and not a line you are fitting the generic

 y= a + b*x + c*x^2

so you need the linear and quadratic term so stim.level is the linear term and I(stim.level^2) would be the quadratic term. Try writing your model out on paper in equation form like

rating score = b_0 + b_1 * stim.level + b_2 *stim.level^2 + b.3*(stim.level*factor)

and if wanted you could have an interaction with the squared term but that might not be directly interpretable. 2) Remember you are fitting a parabola NOT a line http://biopt.ub.edu/_/rsrc/1257372769717/force-detection/equipartition-theorem/ex4%20X%20potential.jpg?height=315&width=420 something that looks like that. Did you plot stim.level against rating.score and see a quadratic relationship?

Best Answer

Related Solutions

Factorial Experiment – How to Deal With Repeated Measurements in the Same Condition of a Factorial Experiment

Solved – Repeated-measures linear mixed effect model

Related Question