Linear Mixed Models – Including Random Effects in Prediction with LMM

linearmixed modelmodelingpythonstatsmodels

I have a question regarding Linear Mixed Modeling using statsmodels.

The first picture below shows the mixed model I fitted. My dummy dataset only contains one variable, and multiple groups. I would like to predict using not only the fixed intercept and coefficient (see results Intercept and A), but also the random effects of the groups. It seems like it is possible in R, but not using Python (https://www.rdocumentation.org/packages/lme4/versions/1.1-23/topics/predict.merMod)

The second picture shows the predict functionality and some dummy data. As you can see, the prediction is a result of the a combination of only the fixed effects.

Or would it be weird to fit a linear mixed model and also use the random effects in prediction?

Thanks in advance!

enter image description here

enter image description here

Best Answer

There are situations where it would make sense to include the predicted random effects (BLUPs) in a prediction. If you have observations on a group (e.g. a person) and you want to make predictions about additional responses from the same group, then you would generally want to include the BLUP.

This is possible in statsmodels, as illustrated below. We should add an option to make this easier. For now, you have to do it manually, as shown below.

import pandas as pd
import statsmodels.api as sm
import numpy as np

# Simulate data for illustration
group_size = 5
n_groups = 100
x1 = np.random.normal(size=group_size*n_groups)
x2 = np.random.normal(size=group_size*n_groups)
u = np.kron(np.random.normal(size=n_groups), np.ones(group_size))
g = np.kron(np.arange(n_groups), np.ones(group_size))
e = np.random.normal(size=group_size*n_groups)
y = x1 - x2 + u + e

# Fit a multilevel model
df = pd.DataFrame({"y":y, "x1":x1, "x2":x2, "g":g})
model = sm.MixedLM.from_formula("y ~ x1 + x2", groups="g", data=df)
result = model.fit()

# The BLUPs
re = result.random_effects

# Multiply each BLUP by the random effects design matrix for one group
rex = [np.dot(model.exog_re_li[j], re[k]) for (j, k) in enumerate(model.group_labels)]

# Add the fixed and random terms to get the overall prediction
rex = np.concatenate(rex)
yp = result.fittedvalues + rex