Solved – Reporting results of linear mixed-effects model

mixed modelrepeated measuresreporting

Linear mixed-effects models aren't commonly used in my corner of biology, and I need to report the statistical test I used in a paper I'm trying to write. I know that awareness of multilevel modeling is starting to appear in some areas of the biosciences (A solution to dependency: using multilevel analysis to accommodate nested data), but I'm still trying to learn how to report my results!

My experimental design, in brief:
*Subjects were assigned to one of four treatment groups
*Measurements of the dependent variable were taken on various days after the start of treatment
*The design is unbalanced (unequal numbers of subjects in the treatment groups, and missing measurements for some subjects on some days)
*Treatment A is the reference category
*I centered the data at the final day of treatment

I want to know whether Treatment A (the reference category) yields significantly better results than the other treatments (at the end of treatment).

I did my analysis in R, using nlme:

mymodel <- lme(dv ~ Treatment*Day, random = ~1|Subject, data = mydf, na.action = na.omit, 
+ correlation = corAR1(form = ~1 |Subject), method = "REML")

And the output (in part; truncated for brevity) is:

>anova(mymodel)
              numDF denDF  F-value p-value
(Intercept)      1   222 36173.09  <.0001
Treat            3    35    16.61  <.0001
Day              7   222     3.43  0.0016
Treat:Day       21   222     3.62  <.0001

>summary(mymodel)
Fixed effects: dv ~ Treatment * Day 
                       Value Std.Error  DF  t-value p-value
(Intercept)         7.038028 0.1245901 222 56.48945  0.0000
TreatmentB          0.440560 0.1608452  35  2.73903  0.0096
TreatmentC          0.510214 0.1761970  35  2.89570  0.0065
TreatmentD          0.106202 0.1637436  35  0.64859  0.5208

So, I know that the effect of Day differs by Treatment, and that, on the final day of treatment (where the data are centered), dv is significantly different in Treatment A than Treatments B or C.

What I want to say is: "As predicted, we found that Dependent Variable was significantly lower in subjects receiving Treatment A (mean +/- SE) than in subjects receiving Treatment B (mean +/- SE, p=0.0096) or Treatment C (mean +/- SE, p=0.0065), as measured on the final day of treatment."

But, I have to indicate what statistical test was done. Would this be an acceptable way to describe the analysis? "[Measurement Method] was performed on days indicated and Dependent Variable (units) was determined; we analyzed the log-transformed data using a linear mixed-effects model centered at [final day of treatment]. Symbols represent mean dv; error bars are standard error. On the final day of treatment, dv was significantly lower in Treatment A (mean +/- SE) than Treatment B (mean +/- SE, p=0.0096)…"

Specifically,
*Does that say enough about the statistical test used? (Readers are used to seeing something more like "mean +/- SE, p=0.0096, Student's t-test," but it seems weird to write "p=0.0096, coefficient for Treatment B vs. Treatment A from linear mixed-effects model at [final day of treatment].")
*Is there a better way to put this?

(The methods section will include more information about the stats: "[Measurement Method] data were analyzed using R and the R packages… We analyzed the log-transformed Dependent Variable data by using linear mixed effects models using Subjects as random effects and an autocorrelation structure of order 1 (AR1). As fixed effects, we included Treatment and Day, and the interaction of Treatment and Day. We checked for normality and homogeneity by visual inspections of plots of residuals against fitted values. To assess the validity of the mixed effects analyses, we performed likelihood ratio tests comparing the models with fixed effects to the null models with only the random effects.")

Any advice on how to report results of a linear mixed effects model for an often stats-averse audience (and written by a relative stats novice) would be greatly appreciated!

Best Answer

This may not help answer your question, but I noticed that you have a repeated measure (Day) in your experiment, but you did not indicate that this was a repeated measure in your model. I would have thought the random term in your model to be as such:

mymodel <- lme(dv ~ Treatment*Day, random = ~1|Subject/Day, 
               data = mydf, na.action = na.omit,
               correlation = corAR1(form = ~1 |Subject/Day), method = "REML")

As for reporting the results, did you intend to report on the day on which you start seeing significant differences between the treatments? If so, then I think you'll need to look at/report on the contrasts on the interaction term as well. I'm a stats novice myself and basically have the same question as you do :-)

Andy Field's "Discovering Statistics Using R" explains how to report results from a linear mixed effects model in Ch14. I don't have the book at hand but can edit this post once I get my hands on it again.

Related Solutions

Solved – Mixed effects model or mixed design ANOVA in R

So I've done a lot of reading and chatting to people and I have a solution.

My experimental design is a split plot design, which is quite different from a nested or hierarchical design. I was originally confusing the terms. As Robert correctly states in his answer, what is needed is a mixed effects model. Thus:

Fixed effects: Year, Treatment1, Treatment2

Random effects: Year, Block, Treatment1

The model is specified thus:

mod<- lmer(Richness~Treatment1*Treatment2*Year+(1|Block/Treatment1)+(1|Year),data=dat,poisson)

The fixed effects are the terms specified in the brackets. Since none of these are continuous (the effect of Year doesn't necessarily increase each year in a linear fashion so I have classed it as a categorical fixed effect), they are specified 1|fixed effect, where 1 represents the intercept.

If Block were actually a continuous fixed effect (obviously hypothetical!) then the fixed effects might be specified +(Block|Treatment1)+(1|Year).

The model can then be simplified as appropriate.

Several things to note:

1) When specified as a random effect, Year is listed separately from Block and Treatment1, since it doesn't have an intuitive "level" at which to be nested between them (Year isn't any different at any plot size of the experiment: for every block, plot and subplot Year is the same.

2) Treatment 2 does not need to be specified as a random effect since it represents the highest level of replication in the experiment and therefore will not be psuedoreplicated

3) In mixed effects models it is possible to specify an error distribution if errors are not normal. I have specified poisson here, since my response data are counts - this improved the distribution of the model residuals.

Solved – Maximal model for linear mixed-effects model for repeated mesaures design

The maximal structure would need to include also a random effect for the interaction between color and shape, that is:

Y ~ color * shape + (color + shape + color:shape | subject)

This will result in all your predictors (color, shape and their interaction) having a fixed effect (constant for all subjects), and a random effect (individual fluctuations around the estimated fixed effect). In this sense the model is the maximal one. Note that it might not be fully equivalent to a repeated-measures ANOVA as it doesn't make equally strict assumptions on the correlational structure (see Tom's answer).

If you don't include the interaction in the random effect part of the formula, individual variation in the interaction effect will not be considered as "random", and the model will not be equivalent to a repeated-measures ANOVA. Of course, the variance of the random deviates for the interaction (or any other random effect) might be so small that including it in the model do not improve much the fit. You can check this not only with the AIC, but with a likelihood ratio test, as model with vs without one random effect are nested one another. In principle if the likelihood ratio test is not significant, it means that you can safely remove that random effect. Simplifying the random effect structures by removing negligible components would be an example of what in the article you linked is called data-driven approach.

You can simplify the model in this way, and it would still be equivalent to a repeated-measures ANOVA:

Y ~ color*shape + (1|subject) + (0+color|subject) + (0+shape|subject) + (0+color:shape|subject)

This syntax tells lmer to not estimate the correlations of random deviates across subjects. The drawback here is that, for example, you won't be able to tell whether subjects that have a large effect of color tend to have also a larger effect of shape (or smaller effect, in case of negative correlation).

You can easily include a between-subjects predictor, the only difference is that you can't add a random effect for it. "gender" for example cannot have a random effect grouped according to subject, but it can interact with the other fixed effects, e.g.:

Y ~ color * shape * gender + (color + shape + color:shape | subject)

Best Answer

Related Solutions

Solved – Mixed effects model or mixed design ANOVA in R

Solved – Maximal model for linear mixed-effects model for repeated mesaures design

Related Question