The correct lmer model formula for a Split Plot design

I have a standard split-plot design, having the three predictors:

A :: [ Categorical, Unordered, 2 Levels ] –> (1, 2)
B :: [ Categorical, Unordered, 4 Levels ] –> (1, 2, 3, 4)
C :: [ Categorical, Unordered, 2 Levels ] –> (1, 2)

Predictor B indicates the main plots, Predictor A indicates the treatment applied to entire main plots, and Predictor C indicates the treatment applied to the subplots within main plots.

The three crossed/nested pairwise relationships are:

A — C :: A is crossed with C
A -> B :: A nests B (i.e. B is nested within A)
B — C :: B is crossed with C

An example data set containing 8 measurements (but only showing the predictor information, not the measured responses) for this example experiment can be written in table format:

	A	B	C
1	1	1	1
2	1	1	2
3	1	2	1
4	1	2	2
5	2	3	1
6	2	3	2
7	2	4	1
8	2	4	2

I would like to model this design as a mixed model using lme4's lmer() function, but I am confused as to the correct model formula. I see two options…

Model 1

R ~ A*C + (1|B)

This model formula has the random effects design matrix:

…and a random effects covariance matrix:

Model 2

R ~ A*C + (1|A/B)

This model formula has the random effects design matrix:

…and a random effects covariance matrix:

Both models have the same fixed effect design matrix.

Model 2 seems to properly depict predictor B as nested within A, but it seems strange to put A as both a fixed effect and a random effect.

My question is: which of these two models is correct, and why?

# code based on https://www.muscardinus.be/2017/08/fixed-and-random/ library(tidyverse) library(lme4) n_a <- 2 n_b <- 2 # changed to fit your case n_sample <- 2 sd_A <- 2 sd_B <- 1 sd_noise <- 1 dataset <- expand.grid( B = paste0("b", seq_len(n_a * n_b)), sample = seq_len(n_sample) ) %>% mutate( A = paste0("a", as.integer(B) %% n_a) %>% factor(), mu = rnorm(n_a, sd = sd_A)[A] + rnorm(n_a * n_b, sd = sd_B)[B], Y = mu + rnorm(n(), sd = sd_noise) ) # add C dataset$C <- as.character(rep(0:1, each = 4)) dataset %>% arrange(A, B, C) %>% select(A,B,C) # none work well summary(lmer(Y ~ A*C + (1|B), data = dataset)) summary(lmer(Y ~ A+C + (1|B), data = dataset)) summary(lmer(Y ~ C + (1|A/B), data = dataset)) # alternative with differences dataset2 <- dataset %>% arrange(A, B, C) %>% group_by(A, B) %>% summarise(d = Y[1] - Y[2]) summary(lm(d ~ A, data = dataset2)) # compare summary(lm(Y ~ A*C, data = dataset)) # small n should have clear visual impact. This should be a counter example to that. ggplot(data = dataset, aes (x = C, y = Y, color = A, group = B)) + geom_path() + geom_point()

Best Answer

Ben Bolkers GLMM-FAQ is a reputable source and has quite bit to say about such problems. Specifically here: https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#when-can-i-include-a-predictor-as-both-fixed-and-random where it links to this blog-post: https://www.muscardinus.be/2017/08/fixed-and-random/ which clearly states that on should not use a categorical variable as both fixed and random.

So Model 1 is the answer in general, however with 8 observations and 4 fixed effect coefficients you probably won't be able to estimate the standard deviation of the random effects. See here: https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#singular-models-random-effect-variances-estimated-as-zero-or-correlations-estimated-as---1

I would just fit a model of the difference within each B with A as covariable to represent the interaction-coefficient A2:C2.

Here are some demonstration in R, however a lot of the factors start at 0 not 1:

Best Answer

Related Solutions

Related Question