R Software – Using the predict() Function for lmer Mixed Effects Models

Tags: lme4, nlme, mixed-model, r

The problem:

I have read in other posts that predict is not available for mixed effects lmer {lme4} models in [R].

I tried exploring this subject with a toy dataset…

Background:

The dataset is adapted from this source, and available as…

require(gsheet)
data <- read.csv(text = 
     gsheet2text('https://docs.google.com/spreadsheets/d/1QgtDcGJebyfW7TJsB8n6rAmsyAnlz1xkT3RuPFICTdk/edit?usp=sharing',
        format ='csv'))

These are the first rows and headers:

> head(data)
  Subject Auditorium Education Time  Emotion Caffeine Recall
1     Jim          A        HS    0 Negative       95 125.80
2     Jim          A        HS    0  Neutral       86 123.60
3     Jim          A        HS    0 Positive      180 204.00
4     Jim          A        HS    1 Negative      200  95.72
5     Jim          A        HS    1  Neutral       40  75.80
6     Jim          A        HS    1 Positive       30  84.56
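In case the Google Sheet ever becomes unavailable, a minimal stand-in with the same columns can be rebuilt by hand. This is a sketch only: it reconstructs just the six rows printed above, while the real dataset has 45 rows over 5 subjects.

```r
# Hypothetical fallback: rebuild only the six rows shown by head(data).
# The full dataset (45 rows, 5 subjects) must still come from the sheet above.
data <- data.frame(
  Subject    = "Jim",
  Auditorium = "A",
  Education  = "HS",
  Time       = rep(c(0, 1), each = 3),
  Emotion    = rep(c("Negative", "Neutral", "Positive"), times = 2),
  Caffeine   = c(95, 86, 180, 200, 40, 30),
  Recall     = c(125.80, 123.60, 204.00, 95.72, 75.80, 84.56)
)
head(data)
```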

We have repeated observations (Time) of a continuous measurement, namely the Recall rate of some words, and several explanatory variables, including random effects (the Auditorium where the test took place; the Subject name) and fixed effects, such as Education, Emotion (the emotional connotation of the word to remember), or the mg of Caffeine ingested prior to the test.

The idea is that hyper-caffeinated, wired subjects find it easy to remember, but that this ability decreases over time, perhaps due to tiredness. Words with a negative connotation are harder to remember. Education has a predictable effect, and even the auditorium plays a role (perhaps one was noisier, or less comfortable). Here are a couple of exploratory plots:

[exploratory plots]

Differences in recall rate as a function of Emotional Tone, Auditorium and Education:

[plot omitted]


When fitting lines through the data cloud with the call:

library(lme4)
fit1 <- lmer(Recall ~ (1|Subject) + Caffeine, data = data)

I get this plot:

[plot of Recall vs. Caffeine with one fitted line per Subject]

with the following code (notice the call to `predict(fit1)` in it):

library(ggplot2)
p <- ggplot(data, aes(x = Caffeine, y = Recall, colour = Subject)) +
  geom_point(size = 3) +
  geom_line(aes(y = predict(fit1)), size = 1)
print(p)

while the following model:

fit2 <- lmer(Recall ~ (1|Subject/Time) + Caffeine, data = data)

which incorporates Time into the random-effects nesting, together with analogous plotting code, produces a surprising plot:

p <- ggplot(data, aes(x = Caffeine, y = Recall, colour = Subject)) +
  geom_point(size = 3) +
  geom_line(aes(y = predict(fit2)), size = 1)
print(p)

[plot of predict(fit2) fitted lines, one zig-zagging line per Subject]


The question:

How does the predict function operate in this lmer model? Evidently it takes the Time variable into account, resulting in a much tighter fit and the zig-zagging that tries to display the third dimension of Time portrayed in the first plot.
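For what it's worth, predict.merMod exposes this directly through its re.form argument, which controls which random effects the predictions are conditioned on. A sketch, assuming fit2 and data from above:

```r
# re.form = NULL (the default) conditions on all estimated random effects;
# re.form = NA uses the fixed effects only; a one-sided formula keeps a subset.
p_full  <- predict(fit2)                             # fixed + Subject + Time:Subject
p_fixed <- predict(fit2, re.form = NA)               # fixed effects only
p_subj  <- predict(fit2, re.form = ~ (1 | Subject))  # drop the Time:Subject term
head(cbind(p_full, p_subj, p_fixed))
```

Comparing the three columns shows how much of the tight, zig-zagging fit comes from the Time:Subject intercepts.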

If I call predict(fit2) I get 132.45609 for the first entry, which corresponds to the first point. Here is the head of the dataset with the output of predict(fit2) attached as the last column:

> data$predict = predict(fit2)
> head(data)
  Subject Auditorium Education Time  Emotion Caffeine Recall   predict
1     Jim          A        HS    0 Negative       95 125.80 132.45609
2     Jim          A        HS    0  Neutral       86 123.60 130.55145
3     Jim          A        HS    0 Positive      180 204.00 150.44439
4     Jim          A        HS    1 Negative      200  95.72 112.37045
5     Jim          A        HS    1  Neutral       40  75.80  78.51012
6     Jim          A        HS    1 Positive       30  84.56  76.39385

The coefficients for fit2 are:

> coef(fit2)
$`Time:Subject`
         (Intercept)  Caffeine
0:Jason     75.03040 0.2116271
0:Jim       94.96442 0.2116271
0:Ron       58.72037 0.2116271
0:Tina      70.81225 0.2116271
0:Victor    86.31101 0.2116271
1:Jason     59.85016 0.2116271
1:Jim       52.65793 0.2116271
1:Ron       57.48987 0.2116271
1:Tina      68.43393 0.2116271
1:Victor    79.18386 0.2116271
2:Jason     43.71483 0.2116271
2:Jim       42.08250 0.2116271
2:Ron       58.44521 0.2116271
2:Tina      44.73748 0.2116271
2:Victor    36.33979 0.2116271

$Subject
       (Intercept)  Caffeine
Jason     30.40435 0.2116271
Jim       79.30537 0.2116271
Ron       13.06175 0.2116271
Tina      54.12216 0.2116271
Victor   132.69770 0.2116271

My best bet was…

> coef(fit2)[[1]][2,1]
[1] 94.96442
> coef(fit2)[[2]][2,1]
[1] 79.30537
> coef(fit2)[[1]][2,2]
[1] 0.2116271
> data$Caffeine[1]
[1] 95
> coef(fit2)[[1]][2,1] + coef(fit2)[[2]][2,1] + coef(fit2)[[1]][2,2] * data$Caffeine[1]
[1] 194.3744

What is the formula to get instead to 132.45609?


EDIT for rapid access… The formula to calculate the predicted value (according to the accepted answer) is based on the ranef(fit2) output:

> ranef(fit2)
$`Time:Subject`
         (Intercept)
0:Jason    13.112130
0:Jim      33.046151
0:Ron      -3.197895
0:Tina      8.893985
0:Victor   24.392738
1:Jason    -2.068105
1:Jim      -9.260334
1:Ron      -4.428399
1:Tina      6.515667
1:Victor   17.265589
2:Jason   -18.203436
2:Jim     -19.835771
2:Ron      -3.473053
2:Tina    -17.180791
2:Victor  -25.578477

$Subject
       (Intercept)
Jason   -31.513915
Jim      17.387103
Ron     -48.856516
Tina     -7.796104
Victor   70.779432

… for the first entry point:

> summary(fit2)$coef[1]
[1] 61.91827             # Overall intercept for Fixed Effects 
> ranef(fit2)[[1]][2,]   
[1] 33.04615             # Time:Subject random intercept for Jim
> ranef(fit2)[[2]][2,]
[1] 17.3871              # Subject random intercept for Jim
> summary(fit2)$coef[2]
[1] 0.2116271            # Fixed effect slope
> data$Caffeine[1]
[1] 95                   # Value of caffeine

summary(fit2)$coef[1] + ranef(fit2)[[1]][2,] + ranef(fit2)[[2]][2,] + 
                    summary(fit2)$coef[2] * data$Caffeine[1]
[1] 132.4561
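The same bookkeeping can be checked for all 45 rows at once. Here is a sketch (assuming fit2 and data from above) that rebuilds predict(fit2) from fixef() and ranef() by looking up each row's group labels:

```r
# Manual reconstruction of predict(fit2): fixed part plus the Time:Subject
# and Subject random intercepts, indexed by each row's group labels.
fe <- fixef(fit2)
re <- ranef(fit2)
manual <- unname(fe["(Intercept)"]) + fe["Caffeine"] * data$Caffeine +
  re$`Time:Subject`[paste(data$Time, data$Subject, sep = ":"), "(Intercept)"] +
  re$Subject[as.character(data$Subject), "(Intercept)"]
all.equal(unname(manual), unname(predict(fit2)))  # should be TRUE
```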

The code for this post is here.

Best Answer

It's easy to get confused by the presentation of coefficients when you call coef(fit2). Look at the summary of fit2:

> summary(fit2)
Linear mixed model fit by REML ['lmerMod']
Formula: Recall ~ (1 | Subject/Time) + Caffeine
   Data: data
REML criterion at convergence: 444.5

Scaled residuals: 
 Min       1Q   Median       3Q      Max 
-1.88657 -0.46382 -0.06054  0.31430  2.16244 

Random effects:
 Groups       Name        Variance Std.Dev.
 Time:Subject (Intercept)  558.4   23.63   
 Subject      (Intercept) 2458.0   49.58   
 Residual                  675.0   25.98   
Number of obs: 45, groups:  Time:Subject, 15; Subject, 5

Fixed effects:
            Estimate Std. Error t value
(Intercept) 61.91827   25.04930   2.472
Caffeine     0.21163    0.07439   2.845

Correlation of Fixed Effects:
 (Intr)
Caffeine -0.365

There is an overall intercept of 61.92 for the model, with a Caffeine coefficient of 0.212. So for Caffeine = 95 you predict an average Recall of 82.06.
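That fixed-effects-only average can be pulled straight from the model; a sketch, assuming fit2 from above. With the unrounded coefficients the value is about 82.02 (61.91827 + 0.2116271 × 95); the 82.06 quoted above just reflects rounding the coefficients to 61.92 and 0.212.

```r
# Fixed part only: overall intercept + Caffeine slope * 95.
unname(fixef(fit2)["(Intercept)"] + fixef(fit2)["Caffeine"] * 95)
# predict() returns the same number when random effects are dropped:
predict(fit2, re.form = NA)[1]
```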

Instead of using coef, use ranef to get the difference of each random-effect intercept from the mean intercept at the next higher level of nesting:

> ranef(fit2)
$`Time:Subject`
         (Intercept)
0:Jason    13.112130
0:Jim      33.046151
0:Ron      -3.197895
0:Tina      8.893985
0:Victor   24.392738
1:Jason    -2.068105
1:Jim      -9.260334
1:Ron      -4.428399
1:Tina      6.515667
1:Victor   17.265589
2:Jason   -18.203436
2:Jim     -19.835771
2:Ron      -3.473053
2:Tina    -17.180791
2:Victor  -25.578477
$Subject
       (Intercept)
Jason   -31.513915
Jim      17.387103
Ron     -48.856516
Tina     -7.796104
Victor   70.779432

The values for Jim at Time=0 will differ from that average value of 82.06 by the sum of both his Subject and his Time:Subject coefficients:

$$82.06+17.39+33.04=132.49$$

which I think is within rounding error of 132.46.

The intercept values returned by coef appear to represent the overall intercept plus the Subject- or Time:Subject-specific differences, so they are harder to work with; if you tried to do the above calculation with the coef values, you would double-count the overall intercept.
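That relationship is easy to verify: for each grouping factor, the coef() intercepts should equal the fixed-effect intercept plus the corresponding ranef() column. A sketch, assuming fit2 from above:

```r
# coef() already folds the overall intercept into each group's intercept,
# e.g. Jim: 61.92 (fixed) + 17.39 (ranef) = 79.31 (coef).
all.equal(coef(fit2)$Subject[, "(Intercept)"],
          unname(fixef(fit2)["(Intercept)"]) + ranef(fit2)$Subject[, "(Intercept)"])
```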