Solved – What are the fitted values in a random effects model

mixed modelrregression

Consider a linear random intercept model:

\begin{align}
y_{ij} &= A_{i} + \varepsilon_{ij} \\
A_{i} &\sim N(0,\tau^2) \\
\varepsilon_{ij} &\sim N(0,\sigma^2)
\end{align}

where, $A_i$ and $\varepsilon_{ij}$ are iid and independent of each other. I would fit this in R with something like lmer(y ~ (1|id)), where id is the group index ($i$ in the previous sentence). What are the fitted values I get by calling fitted() on the object returned by lmer?

I have heard it said (perhaps the speaker was speaking loosely) that making an effect random instead of fixed doesn't affect the point estimate, just its variance. Well if I used a fixed intercept above I would expect OLS to give the group means as the fitted values, which is not what I get when I run fitted on the random effects model. Also the remark I quoted doesn't seem quite right, because in the fixed effects case you would be fitting a term for each group whereas in the random effects case you would just have 1 df, right, the variance of the group effect? Is that right? Does the random effect likelihood equation integrate out the intercept values so just the group effect variance needs to be estimated? If my understanding is correct, and intercept terms are never estimated, how does one interpret fitted values in a random effects model?

Best Answer

Your model does fit a population mean intercept. It also fits a variance for the population distribution. From the data and these two, random intercepts are predicted for each study unit. (See my answer here: Why do the estimated values from a Best Linear Unbiased Predictor (BLUP) differ from a Best Linear Unbiased Estimator (BLUE)?) In your (relatively straightforward) model, the fitted values are the sum of the estimated fixed effect coefficient and the predicted random intercept. Put another way, the values are the predicted values ($\hat y_i$) for the study units based on your model, knowing both the estimated fixed and predicted random effects.

Here is a quick demonstration using the code from my linked answer:

cbind(fitted(re.mod3), 
      rep(coef(summary(re.mod3))[1]+ranef(re.mod3)[[1]][[1]], each=5))
#        [,1]     [,2]
# 1  13.19965 13.19965
# 2  13.19965 13.19965
# 3  13.19965 13.19965
# 4  13.19965 13.19965
# 5  13.19965 13.19965
# 6  16.31164 16.31164
# 7  16.31164 16.31164
# 8  16.31164 16.31164
# 9  16.31164 16.31164
# 10 16.31164 16.31164
# 11 17.47962 17.47962
# 12 17.47962 17.47962
# 13 17.47962 17.47962
# 14 17.47962 17.47962
# 15 17.47962 17.47962
# 16 15.49802 15.49802
# 17 15.49802 15.49802
# 18 15.49802 15.49802
# 19 15.49802 15.49802
# 20 15.49802 15.49802
# 21 13.82224 13.82224
# 22 13.82224 13.82224
# 23 13.82224 13.82224
# 24 13.82224 13.82224
# 25 13.82224 13.82224