Solved – split-split plot design with unbalanced repeated measures in lme4 or nlme (SAS translation)

I am sorry if this answer has been answered before but most answers here (e.g. here, or here ) do not really adress my issue (or maybe I just do not see correctly how they do.
I want to use a (linear) mixed effects analysis on my data. My design is quite straightforward:

I have two kinds of bacteria (M1 and M2) that are either paired with 8 other bacteria (S1) or not (S2).
In case of S1 quadruplicates bottles were measured and in case of S2 just triplicates.
The output variable is measured at multiple timepoints (repeated measures) for approx 72h.
Then a defined amount of biomass is taken to inoculate new bottles and so on for a total of 4 cycles (SB1 to 4). Hence the measurements of consecutive cycles are not independent (corelation).
The amount of timepoints per cycle is not always the same (unbalanced)

Ok, maybe not that straightforward, let's clarify with a picture:
Design
According to one of my colleagues with a lot of experience in mixed models this is a split-split plot design with repeated measures. Sadly, she is a SAS user and I would love to use R (lme or lmer) for the analysis.
However, I'm not here for the translation (that would be something for stack overflow I suppose, although if anyone here could answer it would be nice) but rather for understanding whether my design is indeed split-split with unbalanced repeated measures and how I can construct an adequate model to analyze it in terms of nested, fixed and random effects.

Here is wat I have so far with some dummy data (I try to use split-split plot nomenclature):

output <- rnorm(266)
mainplot <- c(rep("M1",133),rep("M2",133))
subplot <- c(rep("S1",76),rep("S2",57),
             rep("S1",76),rep("S2",57))
subsubplot <-c(rep("SB1",24),rep("SB2",16),rep("SB3",20),rep("SB4",16),
               rep("SB1",18),rep("SB2",12),rep("SB3",15),rep("SB4",12),
               rep("SB1",24),rep("SB2",16),rep("SB3",20),rep("SB4",16),
               rep("SB1",18),rep("SB2",12),rep("SB3",15),rep("SB4",12))
time<-rnorm(266)

mockset <- data.frame(output,mainplot,subplot,subsubplot,time)

mock.lmefit1 <- lme(output ~ 0 + mainplot * subplot * subsubplot, 
               data = mockset,
               random = ~ time|mainplot/subplot/subsubplot)

The mockset really represents my "unbalanced repeated measures" data structure. In the model above I do not account for my "autocorrelation" between subsubplots (i.e. Cycles SB1 through 4) nor do I correct explicitely for the "unbalancedness" of my repeated measures. I am not interested in my intercept.

My colleague suggested to use the following SAS code:

proc mixed data=rdata;
class mainplot subplot subsubplot time;
model output=  mainplot subplot
               subsubplot subplot*subsubplot
               time subplot*time
               subplot*subsubplot*time/ddfm=KENWARDROGER;
random mainplot*subplot mainplot*subplot*subsubplot;
lsmeans subsubplot subplot subplot*subsubplot
        time subplot*time subsubplot*time
        subplot*subsubplot*time;
run;

If I got it right the * in SAS is analogous to the R :, i.e. it shows the interaction between terms.
To me it is not exactly clear how to port this model from SAS to R. I do also not really get why specific effects are fixed here and others are random. One more thing that I do not have an option to give in into R is the ddfm (denominator degrees of freedom for the tests of fixed effects) option of SAS. What is the importance of this? Does anyone know how lme deals with this?

mock.lmefitSAS <- lme(output ~ mainplot + subplot +
                        subsubplot + subplot:subsubplot +
                        time + subplot:time + 
                        subsubplot:time +
                    subplot:subsubplot:time,
                data = mockset,
                random = ~ 1|mainplot:subplot + mainplot:subplot:subsubplot)

This code does not work, because this is not the proper way to give in interaction terms in lme I guess (? error is "invalid formula for groups") but as stated before I'm not here for a SAS-to-R translation…
I would just like to know if and why the SAS model is more correct given my design and what my R implementation is lacking.

Best Answer

The one practical thing I can tell you is that the denominator degrees-of-freedom business is available for lme4 models, using the pbkrtest package or various wrappers for it: see the ?pvalues man page from recent versions of lme4.

library("lme4")
options(contrasts=c("contr.sum","contr.poly"))
m1 <- lmer(output~0+mainplot*subplot*time +
 (time|mainplot:subplot:subsubplot),
           data=mockset)
library("car")
Anova(m1,type="3",test="F")

For the rest, I pretty much just have to agree with you. I'm a bit surprised by your colleague's specification -- I independently reconstructed your R-style syntax before I saw yours, and it doesn't make much sense to me to treat terms involving subsubplot (bottle) as fixed, or to treat mainplot*subplot as random (for both philosophical and practical, i.e. insufficient-replication, reasons). Perhaps she could chime in?

Best Answer

Related Solutions

Solved – Mixed-design with split-plot and mixed effect

Solved – ezAnova vs. lme for factorial repeated-measures design: results differ, why