Solved – Linear mixed-effects models and overfitting: how to calculate predictive R-squared

mixed-model r r-squared

I am using R to build the random structure of my model, but I am ending up with a very complex model. It currently looks like this:

Model <- lmer(x ~ y * z * d * k + (1 + y * z + d | subject),
              data = Data,
              REML = FALSE,
              control = lmerControl(optimizer = "bobyqa",
                                    optCtrl = list(maxfun = 100000)))

I would like to know if I am simply overfitting. Is there a way to calculate a predictive R-squared for linear mixed-effects models?

I am aware of the MuMIn package for obtaining R-squared values, but since I am concerned about overfitting, I wanted to check whether the model's degrees of freedom are unduly biasing the AIC and p-values when comparing models with anova().

Best Answer

This is an interesting question. An approach for calculating predictive R-squared for linear models is given at this rpubs page, but it won't work directly for mixed models. There is an influence function in the car package that handles mixed models, but I couldn't figure out how to adapt it to this purpose. If I understand predictive R-squared correctly, the most fruitful approach is probably to write a function that removes the data one observation at a time, refits the model, and checks how well it predicts each dropped observation. I didn't find anything that addresses how to do this for mixed models specifically.
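The observation-by-observation idea above can be sketched as a leave-one-out PRESS statistic. The function below is a rough sketch, not a tested implementation: it assumes an lme4 fit, a data frame with no missing rows, and that the fitted object can be refit with update(); the names predictive_r2, model, data, and response are placeholders I made up. It refits the model n times, so it will be slow for large data or complex random structures, and some refits may fail to converge.

```r
# Sketch of a leave-one-out predictive R-squared (PRESS-based) for a merMod fit.
# Assumes lme4 is installed; function and argument names are hypothetical.
library(lme4)

predictive_r2 <- function(model, data, response) {
  n <- nrow(data)
  preds <- numeric(n)
  for (i in seq_len(n)) {
    # Refit the model without observation i
    fit_i <- update(model, data = data[-i, , drop = FALSE])
    # Predict the held-out row; allow.new.levels guards against a subject
    # whose only observation was the one dropped
    preds[i] <- predict(fit_i, newdata = data[i, , drop = FALSE],
                        allow.new.levels = TRUE)
  }
  press <- sum((data[[response]] - preds)^2)   # prediction error sum of squares
  tss   <- sum((data[[response]] - mean(data[[response]]))^2)
  1 - press / tss                              # predictive R-squared
}
```

For the model in the question this would be called as something like `predictive_r2(Model, Data, "x")`. A value much lower than the within-sample R-squared from MuMIn would be a sign of overfitting.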