Mixed Model Standard Error – Standard Errors in LME4 Linear Mixed Models

lme4-nlme, mixed model, standard error

I'm trying to understand how standard errors for the parameter estimates are calculated in linear mixed models, and why I don't get the same output with different methods. I've made the following example for a simple linear mixed model using package lme4:

library("lme4")
library("lmerTest")
library("effects")
library("emmeans")

response <- c(33,85,77,43,93,87,24,81,65,56,74,96,47,57,94)
ind <- c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5)
treatment <- c("A","B","C","A","B","C","A","B","C","A","B","C","A","B","C")

df <- data.frame(response, ind, treatment)

mod <- lmer(response ~ treatment + (1 | ind), data = df)

summary(mod)

as.data.frame(effect("treatment", mod))
emmeans(mod, specs = "treatment")

summary(mod) produces the following output, where we get the standard errors (for the fixed effects):

Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: response ~ treatment + (1 | ind)
   Data: df

REML criterion at convergence: 100.4

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.6290 -0.5492  0.2168  0.6793  1.1625 

Random effects:
 Groups   Name        Variance Std.Dev.
 ind      (Intercept)   3.551   1.884  
 Residual             164.783  12.837  
Number of obs: 15, groups:  ind, 5

Fixed effects:
            Estimate Std. Error     df t value Pr(>|t|)    
(Intercept)   40.600      5.802 11.989   6.997 1.45e-05 ***
treatmentB    37.400      8.119  8.000   4.607  0.00174 ** 
treatmentC    43.200      8.119  8.000   5.321  0.00071 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
           (Intr) trtmnB
treatmentB -0.700       
treatmentC -0.700  0.500

We can also get standard errors (and confidence intervals) from e.g. the effects and emmeans packages (which produce the same output), and for as.data.frame(effect("treatment", mod)) it looks like this:

  treatment  fit       se    lower    upper
1         A 40.6 5.802299 27.95788 53.24212
2         B 78.0 5.802299 65.35788 90.64212
3         C 83.8 5.802299 71.15788 96.44212

The Estimate and fit columns agree once the coefficients are added up: effect("treatment", mod) reports the treatment means directly (for example 40.6 + 37.4 = 78.0 for treatment B), whereas summary(mod) reports the intercept and the offsets from it. For the standard errors, we get the same value for the intercept/treatment A (5.80), but different values for treatments B and C (8.12 in summary(mod) versus 5.80 in the effect() output). I'm not too familiar with the details of mixed models, and I might be missing something obvious here, but I don't understand why this is the case. My questions are: (1) how are standard errors for the parameters calculated in linear mixed models, (2) why do summary(mod) and effect("treatment", mod) give different values, and (3) which one would be more "correct" to report?

Best Answer

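By default in R, treatment contrasts are used for unordered factors. This means that what you get in the output from summary(mod) for treatmentB and treatmentC are differences from the reference level of treatment (here A), and their standard errors are standard errors of those differences. E.g., 37.4 is the estimated difference between treatment B and treatment A, and 8.119 is the standard error of that difference, not of the mean for treatment B.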
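
To see what treatment coding does to the design matrix, here is a minimal base-R sketch. It fits no model; the factor below is only an illustration that mirrors the three levels used in the question, and it assumes R's default contrast options are in effect.

```r
# Illustrative factor with the same three levels as in the question
treatment <- factor(c("A", "B", "C"))

# Default coding for an unordered factor: A is the reference level
contrasts(treatment)

# One row per level: an intercept column plus 0/1 indicators for B and C
X <- model.matrix(~ treatment)
X
```

The row for level B is (1, 1, 0), so the mean for B is (Intercept) + treatmentB, while the treatmentB coefficient on its own is only the offset from A.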

If you want to get the mean for treatment B, you will need to add the coefficients. For the standard errors, you also need to account for the covariance between the estimates of the fixed effects. The following code illustrates how this is done (which is essentially what effects and emmeans do under the hood):

coefs <- fixef(mod)
V <- vcov(mod)

# mean and std. error for treatment B
DF <- data.frame(treatment = factor("B", levels = LETTERS[1:3]))
X <- model.matrix(~ treatment, data = DF)
c(X %*% coefs)
sqrt(diag(X %*% V %*% t(X)))


# mean and std. error for treatment C
DF <- data.frame(treatment = factor("C", levels = LETTERS[1:3]))
X <- model.matrix(~ treatment, data = DF)
c(X %*% coefs)
sqrt(diag(X %*% V %*% t(X)))
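
For a single row of X, the matrix product above reduces to the usual variance-of-a-sum rule. As a rough check that needs no fitted model, the sketch below plugs in the rounded numbers printed by summary(mod) in the question. The second half is a shortcut that assumes a balanced design in which every individual receives every treatment exactly once, as in this data set; it is not a general formula for mixed models.

```r
# Rounded values copied from the summary(mod) output in the question
se_int <- 5.802   # Std. Error of (Intercept), i.e. the mean of treatment A
se_B   <- 8.119   # Std. Error of treatmentB, i.e. the difference B - A
rho    <- -0.700  # correlation between (Intercept) and treatmentB

# Var(b0 + b1) = Var(b0) + Var(b1) + 2 * Cov(b0, b1)
cov_int_B <- rho * se_int * se_B
se_mean_B <- sqrt(se_int^2 + se_B^2 + 2 * cov_int_B)
se_mean_B   # about 5.80, in line with the effect()/emmeans() output

# Assumption: balanced design, 5 individuals, each measured once per treatment
var_ind <- 3.551     # random-intercept variance from summary(mod)
var_res <- 164.783   # residual variance from summary(mod)
n_ind   <- 5

se_mean <- sqrt((var_ind + var_res) / n_ind)  # SE of a treatment mean
se_diff <- sqrt(2 * var_res / n_ind)          # SE of a within-individual difference
c(mean = se_mean, difference = se_diff)
```

Both routes give roughly 5.80 for a treatment mean and 8.12 for a difference from A. The random intercept cancels when two treatments are compared within the same individual, so the difference carries only residual variance. The two outputs therefore do not contradict each other: they report standard errors of different quantities.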

Related Solutions

Solved – linear regression vs linear mixed effect model coefficients

I don't know that I can give a rigorous theoretical explanation, but a picture may make things clearer:

The blue line is the OLS fit, the gray line is the population-level prediction for the mixed model. The individual lines are predicted lines (all equal slopes, randomly varying intercepts) for each ID.
Since there is some correlation between the mean values of X and Y for each group, some of the variability that would go into the slope is instead taken out by the random intercept term.
The apparently large difference in the intercepts is partly caused by extrapolation (the data starts at X=2, the intercept refers to the expected value at X=0).

d <- data.frame(ID=factor(rep(1:20,each=3)),
                Y=c(1,2,3,5,4,6,7,8,9,2,3,4,5,5,6,7,6,
                    8,3,4,2,1,2,
                    1,5,6,4,7,8,9,8,8,7,6,4,
                    2,4,5,6,6,7,5,3,4,2,1,2,
                    3,4,2,3,5,6,4,7,8,6,9,8,9),
                X=c(3,4,3,6,4,6,6,8,5.5,4,3,5.5,5,7,5.5,7,4.5,6,4,
                    3,4,2.5,4,3,6,6,6.5,7,8,7,7,5.5,6,6.5,4,4,3.5,
                    5,4,5.5,7,4.5,4.5,6,5.5,2,3,6,3,4.5,3,5,6,3,
                    7.5,7.5,5.5,6.5,7,6))

lm1 <- lm(Y ~ X, data = d)
library(lme4)
lmer1 <- lmer(Y ~ X + (1 | ID), data = d)
ff <- fixef(lmer1)
## get predictions
pp <- d
pp$Y <- predict(lmer1)
library(dplyr)
pp <- pp %>%
    group_by(ID) %>%
    filter(Y %in% range(Y))

library(ggplot2); theme_set(theme_bw())
ggplot(d,aes(X,Y,colour=ID))+
    geom_point()+
    scale_colour_discrete(guide="none")+
    geom_line(data=pp)+
    scale_x_continuous(limits=c(0,8))+
    geom_smooth(method="lm",aes(group=1),fullrange=TRUE)+
    geom_abline(slope=ff["X"],intercept=ff["(Intercept)"],
                colour="darkgray",lwd=1.5)
ggsave("CV161703.png")

Solved – emmeans pairwise contrasts result in same output values for all

You have fitted an additive model - the fixed-effects part is condition + location. Therefore you have in fact specified that the differences for one factor are exactly the same at each level of the other factor. Since emmeans() summarizes a model, then, lo and behold, the results reflect what is specified.

If instead you include the interaction between condition and location in the model, then the emmeans() results will reflect the possibility that factor levels compare differently at levels of the other factor.
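
The point can be illustrated without emmeans() itself. The sketch below is only an illustration under stated assumptions: the data are simulated, the factor names condition and location follow the answer but their levels are made up, plain lm() stands in for whatever model was originally fitted, and cond_diff() is a hypothetical helper that compares predictions.

```r
# Hypothetical, simulated data; the level names are invented for illustration
set.seed(1)
dat <- expand.grid(condition = c("ctrl", "trt"),
                   location  = c("north", "south"),
                   rep       = 1:5)
dat$y <- rnorm(nrow(dat), mean = 10) +
  ifelse(dat$condition == "trt" & dat$location == "south", 3, 0)

fit_add <- lm(y ~ condition + location, data = dat)  # additive model
fit_int <- lm(y ~ condition * location, data = dat)  # with interaction

# Prediction grid; rows are ordered with condition varying fastest
grid <- expand.grid(condition = c("ctrl", "trt"),
                    location  = c("north", "south"))

# Hypothetical helper: trt - ctrl difference within each location
cond_diff <- function(fit) {
  p <- predict(fit, newdata = grid)
  c(north = unname(p[2] - p[1]), south = unname(p[4] - p[3]))
}

cond_diff(fit_add)  # identical in both locations by construction
cond_diff(fit_int)  # free to differ between locations
```

In the additive fit the condition difference is a single coefficient, so any summary of that model reports the same contrast at every location.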

I recommend that people think more carefully about the models they are fitting. I think there is a tendency to rush forward without realizing what a crucial thing it is to get the model right.
