# Mixed Models – How Are Random Effects Included in a Linear Mixed Model?

linear model, mixed model, multiple regression, regression coefficients

I'm having difficulty in understanding the "process" that is going on behind how we are calculating all of our parameter estimates and how the random effects are used in our models.

To begin we can express the linear mixed model as:

$$y = X\beta + Zb + \epsilon \\b \sim N(0, \psi_\theta),\ \epsilon \sim N(0,\Lambda_\theta)$$

Here $$X\beta$$ represents the fixed effects and $$Zb$$ the random effects. How are both sets of effects estimated? I understand that, formally, we use maximum likelihood methods to estimate the parameters. But what is the process? Are we estimating the fixed effects $$X\beta$$ and the random effects $$Zb$$ separately and then bringing them together in our model?

I was playing around in R to try to understand more, but I'm still stuck on making the leap. Here is the output I got from fitting an LMM to the iris data set:

Linear mixed model fit by REML ['lmerMod']
Formula: Petal.Width ~ Sepal.Width + Sepal.Length + (1 + Sepal.Length |      Species)
Data: iris

REML criterion at convergence: -66.7

Scaled residuals:
Min       1Q   Median       3Q      Max
-2.76049 -0.54295 -0.08282  0.55066  2.74867

Random effects:
 Groups   Name         Variance Std.Dev. Corr
 Species  (Intercept)  0.24352  0.4935
          Sepal.Length 0.01377  0.1173   0.40
 Residual              0.03091  0.1758
Number of obs: 150, groups:  Species, 3

Fixed effects:
             Estimate Std. Error t value
(Intercept)   0.20829    0.33520   0.621
Sepal.Width   0.27272    0.05248   5.196
Sepal.Length  0.01796    0.07726   0.232

Correlation of Fixed Effects:
            (Intr) Spl.Wd
Sepal.Width -0.116
Sepal.Lngth  0.134 -0.279
> coef(ir_lme2)
$Species
           (Intercept) Sepal.Width Sepal.Length
setosa      -0.1400153   0.2727164  -0.10956019
versicolor   0.0573940   0.2727164   0.08658932
virginica    0.7074925   0.2727164   0.07684240

attr(,"class")
[1] "coef.mer"

So I have fixed effects and random effects as can be seen at the bottom of the outputs. With regards to the random effects I get that we "group" our observations and then use those groupings to get group estimates of our parameters which are the random effects below. Were those random effects estimated in isolation and if so how? Same question with respect to the fixed effects.

Another question is how are those random effects playing into the fixed effect estimates? Are the random effects contributing to the values we see in the fixed effects read outs?

I read previous articles on the site about the ideas:

- What is the difference between fixed effect, random effect and mixed effect models?
- What is a difference between random effects-, fixed effects- and marginal model?

I had also asked a previous question about this, but I might delete it because it is muddied in my confused understanding of the concept.

As you can see my head is all over and I'm very confused about how things are being put together in this model. Any help to clarify things would be appreciated. Even in chat because I feel the things I'm not getting should be easy to clear up.

EDIT: So I attempted to get some clarification from a TA. In my example let's say we end up with the expression:

$$y = \beta_0 +\beta_{SW} \cdot SW + \beta_{SL} \cdot SL + (\alpha_1 + \alpha_2 \beta_{SL}SL)$$

where $$\beta_i$$ correspond to the fixed effects and $$\alpha$$ corresponds to the random effects, $$SW =$$ sepal.width, $$SL =$$ sepal.length, and $$\alpha =$$ random effect from Species.
So, if I understand this correctly, for the random slope we would group by species, take all of the Sepal.Length values by species (let's use setosa as a concrete example), compute an estimate for the variance, use this estimate for the variance in a normal distribution $$\alpha \sim N(0, \sigma_{setosa}^2)$$ from which we would draw a random value for Sepal.Length, and then this would serve as the random factor $$\alpha_2$$ which we would multiply by $$\beta_{SL}$$ to get our value for the random slope? Not looking for the precise mathematics yet, just an understanding.

#### Best Answer

You can think of mixed models as a two-stage modeling approach. First, you fit a model irrespective of the random effects. Second, you model the effect for each level of the grouping factors (random effects) via an approach known as partial pooling; the links at the end of this answer give more explanation and details. Finally, you adjust the fixed-effects model based on the random effects. All of this happens together when running a mixed-effects model.

Here is an example using the sleepstudy data in R:

> m <- lmer(Reaction ~ Days + (1|Subject), data = sleepstudy)
> fixef(m)
(Intercept)        Days
  251.40510    10.46729

For this data set, which is balanced (every subject is measured on the same days), the fixed-effect estimates are the same as the ordinary least squares ones; with unbalanced data the two generally differ:

> m2 <- lm(Reaction ~ Days, data = sleepstudy)
> coef(m2)
(Intercept)        Days
  251.40510    10.46729

Going back to the lmer model, the random intercept estimates (the model has no random slope) are:

> ranef(m)$Subject
(Intercept)
308   40.783710
309  -77.849554
310  -63.108567
330    4.406442
331   10.216189
332    8.221238
333   16.500494
334   -2.996981
335  -45.282127
337   72.182686
349  -21.196249
350   14.111363
351   -7.862221
352   36.378425
369    7.036381
370   -6.362703
371   -3.294273
372   18.115747

with conditional variances for “Subject”
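To make the partial pooling step less of a black box, here is a minimal sketch that reproduces the random intercepts by hand. It assumes the standard shrinkage formula for a random-intercept model: each subject's mean residual from the fixed part is multiplied by a factor between 0 and 1 that depends on the two estimated variances and the number of observations. The variable names (`sigma2_b`, `sigma2_e`, `shrink`, `by_hand`) are made up for this illustration and are not part of lme4.

```r
library(lme4)

# Same random-intercept model as above.
m <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Estimated variance components: between-subject and residual.
vc <- as.data.frame(VarCorr(m))
sigma2_b <- vc$vcov[vc$grp == "Subject"]
sigma2_e <- vc$vcov[vc$grp == "Residual"]

# Residuals from the fixed part only (re.form = NA drops the random effects).
res_fixed <- sleepstudy$Reaction - predict(m, re.form = NA)

# Per-subject mean residual and number of observations.
mean_res <- tapply(res_fixed, sleepstudy$Subject, mean)
n_i <- tapply(res_fixed, sleepstudy$Subject, length)

# Shrinkage factor: close to 0 pulls a subject towards the overall line,
# close to 1 keeps the subject's own mean deviation almost unchanged.
shrink <- n_i * sigma2_b / (n_i * sigma2_b + sigma2_e)

by_hand <- shrink * mean_res
from_lme4 <- ranef(m)$Subject[["(Intercept)"]]

# Show the hand-computed values next to lme4's estimates.
print(cbind(by_hand, from_lme4))
```

The point of the sketch is that the random effects are not fitted in isolation: they are computed from the residuals around the fixed part, using variances estimated from all subjects together.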


Now you adjust the average (fixed effect) intercept for each subject based on the estimated random effect. Let's look at Subject 308 as an example:

$$251.40510 + 40.783710 = 292.1888$$

The result can also be checked by looking at coef(m):

> coef(m)
$Subject
    (Intercept)     Days
308    292.1888 10.46729
309    173.5556 10.46729
310    188.2965 10.46729
330    255.8115 10.46729
331    261.6213 10.46729
332    259.6263 10.46729
333    267.9056 10.46729
334    248.4081 10.46729
335    206.1230 10.46729
337    323.5878 10.46729
349    230.2089 10.46729
350    265.5165 10.46729
351    243.5429 10.46729
352    287.7835 10.46729
369    258.4415 10.46729
370    245.0424 10.46729
371    248.1108 10.46729
372    269.5209 10.46729

And now here's your example that also includes random slopes:

> m3 <- lmer(Petal.Width ~ Sepal.Width + Sepal.Length + (1 + Sepal.Length | Species), data = iris)
> fixef(m3)
 (Intercept)  Sepal.Width Sepal.Length
  0.20829042   0.27271644   0.01795717
> ranef(m3)$Species
(Intercept) Sepal.Length
setosa      -0.3483057  -0.12751737
versicolor  -0.1508964   0.06863214
virginica    0.4992021   0.05888522

with conditional variances for “Species”
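On the question of whether the two sets of effects are estimated separately: in the standard formulation they are not. The following is a brief sketch in the notation of the question, where $$V_\theta$$ is a symbol introduced here for the marginal covariance of $$y$$ and hats denote estimates. Integrating out $$b$$ gives the marginal model

$$y \sim N(X\beta,\ V_\theta), \qquad V_\theta = Z\psi_\theta Z^\top + \Lambda_\theta$$

The variance parameters $$\theta$$ are chosen to maximise the (restricted) likelihood of this marginal model. Given $$\hat\theta$$, the fixed effects are the generalised least squares estimate and the random effects are predicted from the residuals around the fixed part:

$$\hat\beta = \left(X^\top V_{\hat\theta}^{-1} X\right)^{-1} X^\top V_{\hat\theta}^{-1}\, y$$

$$\hat b = \psi_{\hat\theta}\, Z^\top V_{\hat\theta}^{-1}\left(y - X\hat\beta\right)$$

So the random-effects structure influences the fixed-effect estimates through $$V_{\hat\theta}$$, and the values reported by ranef() are conditional on the data and on the estimated parameters, which is what "conditional" in the printout refers to. Software such as lme4 may organise the numerical work differently, but the reported estimates correspond to these quantities.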


Let's combine the fixed and random intercept and slopes together for setosa as an example:

Intercept:

$$0.20829042 - 0.3483057 = -0.1400153$$

Sepal.Width stays the same (no adjustment, i.e. not included in random effect terms):

$$0.27271644$$

Sepal.Length:

$$0.01795717 - 0.12751737 = -0.10956019$$

Let's check with:

> coef(m3)
$Species
           (Intercept) Sepal.Width Sepal.Length
setosa      -0.1400153   0.2727164  -0.10956019
versicolor   0.0573940   0.2727164   0.08658932
virginica    0.7074925   0.2727164   0.07684240
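The same bookkeeping can be done for all species at once. The sketch below rebuilds the coef() table from fixef() and ranef(); `fe`, `re` and `by_hand` are names chosen for this illustration. With only three groups, the fit may print a convergence or singular-fit message, which does not affect the arithmetic shown here.

```r
library(lme4)

# Same random-intercept-and-slope model as above.
m3 <- lmer(Petal.Width ~ Sepal.Width + Sepal.Length + (1 + Sepal.Length | Species),
           data = iris)

fe <- fixef(m3)          # one value per fixed-effect term
re <- ranef(m3)$Species  # one row per species, one column per random term

# Start from the fixed effects, repeated for every species ...
by_hand <- matrix(fe, nrow = nrow(re), ncol = length(fe), byrow = TRUE,
                  dimnames = list(rownames(re), names(fe)))

# ... and add the species-specific deviations, but only to the terms
# that have a random effect (here the intercept and Sepal.Length).
by_hand[, colnames(re)] <- by_hand[, colnames(re)] + as.matrix(re)

print(by_hand)
print(coef(m3)$Species)
```

Sepal.Width is left untouched because it does not appear in the random-effects term, so every species keeps the fixed-effect value for it.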


Here is a good explanation with an example that is easy to follow: https://m-clark.github.io/mixed-models-with-R/random_intercepts.html#the-mixed-model

Another helpful link for understanding complete pooling, no pooling and partial pooling: https://www.r-bloggers.com/2017/06/plotting-partial-pooling-in-mixed-effects-models/