Solved – Random vs fixed effects: why is the standard error the same for the slope but wildly different for the intercept

Tags: lme4-nlme, mixed-model, r, standard-error

I've noticed that when I compare results from fixed- and mixed-effects models, I get very similar standard errors for the slopes of (fixed-effect) covariates, whereas for the intercept the fixed-effects model can have a far smaller standard error (while the parameter estimates themselves are almost identical). Why is this?


EXAMPLE

For example I get these results (abridged here; full results at the bottom):

Mixed Model
              Estimate Std. Error t value
 (Intercept) 21.040772   1.921112   10.95
 x           -1.007014   0.009933 -101.38

Fixed Effects Model
              Estimate Std. Error t value Pr(>|t|)    
 (Intercept) 21.043157   0.055888  376.52   <2e-16 ***
 x           -1.007443   0.009934 -101.41   <2e-16 ***

For data that looks like this (different symbols for different levels of p):

[Figure: scatterplot of y against x, with the plotting symbol indicating the level of p]

From this code:

library(lme4)

set.seed(1)

# Simulate n points for group ppt: x is centred on ppt, and y has slope -1,
# an intercept that shifts by 2 per group, and a small residual error
generatepoint = function(ppt, n){
  x = rnorm(n) + ppt
  y = -x + 10 + 2*ppt + rnorm(n, sd = .1)
  cbind(x = x, y = y, p = ppt)
}

# 10 groups of 10 points each
data = as.data.frame(do.call(rbind, lapply(1:10, generatepoint, 10)))

plot(y ~ x, data = data, pch = data$p)

data$p = factor(data$p)

cat("\n\nMixed Model\n\n")
mod = lmer(y ~ x + (1 | p), data = data)   # random intercept per group
print(summary(mod))

cat("\n\nFixed Model\n\n")
contrasts(data$p) <- contr.sum(10)   # sum contrasts, so the intercept is the grand mean
fmod = lm(y ~ x + p, data = data)    # group as a fixed effect
print(summary(fmod))

Here's the full output:

Mixed Model

Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x + (1 | p)
Data: data

REML criterion at convergence: -88.6

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.6303 -0.6669 -0.1463  0.6713  2.2773 

Random effects:
 Groups   Name        Variance Std.Dev.
 p        (Intercept) 36.87548 6.07252 
 Residual              0.00807 0.08983 
Number of obs: 100, groups:  p, 10

Fixed effects:
             Estimate Std. Error t value
(Intercept) 21.040772   1.921112   10.95
x           -1.007014   0.009933 -101.38

Correlation of Fixed Effects:
  (Intr)
x -0.029


Fixed Model


Call:
lm(formula = y ~ x + p, data = data)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.23547 -0.06006 -0.01333  0.06017  0.20400 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 21.043157   0.055888  376.52   <2e-16 ***
x           -1.007443   0.009934 -101.41   <2e-16 ***
p1          -9.009845   0.051524 -174.87   <2e-16 ***
p2          -7.017192   0.045469 -154.33   <2e-16 ***
p3          -5.005483   0.036105 -138.64   <2e-16 ***
p4          -3.034799   0.029087 -104.34   <2e-16 ***
p5          -0.993759   0.027279  -36.43   <2e-16 ***
p6           0.992106   0.028082   35.33   <2e-16 ***
p7           2.966062   0.030515   97.20   <2e-16 ***
p8           4.987105   0.032854  151.79   <2e-16 ***
p9           7.100627   0.045236  156.97   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.08983 on 89 degrees of freedom
Multiple R-squared:  0.9993,    Adjusted R-squared:  0.9992 
F-statistic: 1.194e+04 on 10 and 89 DF,  p-value: < 2.2e-16

Best Answer

OK, I think I understand this now. One way to see it is through a parametric bootstrap of mod using bootMer, as an alternative way of generating the standard errors. bootMer works by repeatedly simulating new datasets from the parameters of a fitted lmer model and refitting the model to each simulated dataset, to see how the parameter estimates vary across different samples from the same population. Crucially, it has an argument, use.u, which lets you choose whether the random effects are re-simulated for each bootstrap sample or held at their estimated values. If we choose not to re-simulate them, we are simulating a world where the group effects come out at the same values every time -- which sounds more like fixed effects -- and indeed that setting reproduces the standard errors from the fixed-effects model. If instead we let u be simulated afresh each time, we recover the standard errors from the mixed-effects model. (See below for code and output.)

Of course, this fits with the idea of a random effect as one whose levels would come out differently in each replication of the experiment, as opposed to a fixed effect that stays the same (although my one remaining confusion in answering this question is that this seems to be a definition that is disputed in some highly rated StackExchange answers such as this one). That also makes sense of the standard errors presented in the question. If the group effects are indeed "fixed", then we are making inference conditional on those specific levels: in a repeat of the experiment, we would expect the dots to jiggle around within the lines they currently sit on, but the lines themselves would barely move, so the overall intercept is pinned down very precisely. If instead the intercepts of those lines are random effects, the lines should be expected to land in different places in each new sample, and our idea of where the "middle" is becomes far less certain: for all we know, this sample might have come out with all ten levels of the random effect below the true mean, in which case the true mean could even lie above every line in the plot. The slope is different: the lines consistently point at a gradient of -1, and whether we treat the heights of the lines as fixed or random, we observe the same behaviour within each line, so we should expect to draw the same inference about the gradient under both models.
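As a rough back-of-envelope check (my addition, not part of the bootstrap argument, and assuming the between-group variance dominates the uncertainty in the overall intercept): with only 10 groups, the mixed model's intercept standard error should be roughly the between-group standard deviation divided by sqrt(10), which lines up with the reported 1.921.

# Rough sanity check (assumption: between-group variance dominates the
# uncertainty in the overall intercept when there are only 10 groups)
tau2 <- as.numeric(VarCorr(mod)$p)  # between-group intercept variance, ~36.88
sqrt(tau2 / 10)                     # ~1.92, close to the mixed model's SE of 1.921

The fixed-effects model, by contrast, conditions on the ten observed group levels, so its intercept standard error reflects only the (tiny) residual scatter around the fitted lines.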


Code and output:

> booted.simulate.u = bootMer(mod,fixef,nsim=10000,use.u=FALSE)
> booted.fix.u      = bootMer(mod,fixef,nsim=10000,use.u=TRUE )

> print(apply(booted.simulate.u$t,2,sd))
(Intercept)           x 
1.915025294 0.009903413 

> print(apply(booted.fix.u$t,2,sd))
(Intercept)           x 
0.055405505 0.009868006 
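For reference (my addition, not in the original answer, and assuming a reasonably recent version of lme4): confint() can run the same kind of parametric bootstrap internally. Since bootMer's default is use.u = FALSE (random effects re-simulated each time), the resulting interval for the intercept is wide, consistent with the mixed-model standard error above.

# Percentile bootstrap confidence intervals; confint() calls bootMer internally,
# with its default use.u = FALSE, i.e. the random effects are re-simulated
confint(mod, method = "boot", nsim = 1000)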