Solved – Variance explained of a mixed effects model in a new data set

mixed model, r, validation, variance

I'm using the following function to calculate Edwards' R^2 (formula 19 in Edwards et al. 2008) for a mixed effects model (I hope the implementation is correct):

# Written for lme4 < 1.0, where fitted models have class "mer" and a @dims slot
r2lmer <- function(model) {
   require(aod) # need the aod package for the wald.test function
   if (!is(model, "mer")) stop("mer object expected")
   n <- model@dims[['n']] # number of observations
   p <- model@dims[['p']] # number of fixed-effects parameters (incl. intercept)
   df.mod <- n - p        # denominator degrees of freedom

   # Wald F test that all fixed effects except the intercept are zero
   wald.model <- wald.test(b = fixef(model), Sigma = as.matrix(vcov(model)),
                           df = df.mod, Terms = 2:p)
   wald.F <- as.numeric(wald.model$result$Ftest[1])

   ratio <- (p - 1) / df.mod * wald.F
   ratio / (1 + ratio)
}
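
Written out, the quantity this function returns is the following. It is transcribed from the code rather than checked against the paper, with F the Wald F statistic for the p - 1 non-intercept fixed effects and \nu = n - p the denominator degrees of freedom the code uses:

```latex
R^2_\beta \;=\; \frac{\dfrac{p-1}{\nu}\,F}{1 + \dfrac{p-1}{\nu}\,F},
\qquad \nu = n - p
```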

In order to internally validate a mixed effects model, I would like to estimate the variance explained when the model is applied to a new data set, that is, when only the data change but not the parameter estimates. I'm not sure if this is even theoretically possible. Any help is highly appreciated.

Edit

If it's not a good idea to use an R^2 statistic for an LMM, what other performance measures could one use to internally validate an LMM? I have an LMM with just one random effects factor (a varying intercept due to repeated measures) and several fixed effects factors. I'm primarily interested in the fixed effects, and it is my understanding that Edwards' R^2 is a good measure of the variance explained by the fixed effects. Edwards' R^2 is also recommended for cross-validation in this paper:

Cheng et al. Real longitudinal data analysis for real people: Building a good enough mixed model. Statistics in Medicine. 2010; 29: 504-520.

Best Answer

I am more and more convinced that this generally isn't a good idea, because the meaning of R^2 here isn't exactly the same as in a conventional linear regression. As a result, one always runs into interpretation issues, and it often distracts from the meat of the story. Time spent writing a good description of the fit of the model is time better spent... but I could be wrong.

[A good answer to a question on how to write up the fit of an LMM would be really handy; a cursory search didn't turn one up.]
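
The question's idea of keeping the parameter estimates fixed and changing only the data can be sketched in base R as below. This is a minimal illustration under stated assumptions, not Edwards' R^2: it scores population-level predictions (fixed effects only, random effects ignored) with a plain squared-error R^2 and an RMSE. The helper `score_fixed`, the vector `beta_hat` (standing in for `fixef(model)`) and the simulated data are all hypothetical.

```r
# Hypothetical helper: score frozen fixed-effect estimates on new data.
# beta    : named vector of fixed-effect estimates, e.g. fixef(model)
# formula : fixed-effects part of the model formula (response on the left)
# newdata : data frame that was NOT used to fit the model
score_fixed <- function(beta, formula, newdata) {
  mf <- model.frame(formula, newdata)
  X  <- model.matrix(formula, mf)   # design matrix built from the new data
  y  <- model.response(mf)
  # Population-level prediction: fixed effects only, random effects ignored
  pred <- drop(X[, names(beta), drop = FALSE] %*% beta)
  sse <- sum((y - pred)^2)
  sst <- sum((y - mean(y))^2)
  c(rmse = sqrt(mean((y - pred)^2)), r2_pred = 1 - sse / sst)
}

# Simulated illustration (made-up numbers, not from the question)
set.seed(1)
new <- data.frame(x = rnorm(50))
new$y <- 1 + 2 * new$x + rnorm(50, sd = 0.5)
beta_hat <- c("(Intercept)" = 1, x = 2)  # stands in for fixef(model)
score_fixed(beta_hat, y ~ x, new)
```

Because the estimates are not refitted, `r2_pred` can be negative on new data, which is itself informative. Factor predictors need the same levels and contrasts in `newdata` as in the fitting data so that the design matrix columns match the names in `beta`.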