Solved – Interpreting correlation from two linear mixed-effect models

mixed-model, r

I have a linear mixed-effects model that I hope will answer whether an increase in one person's frequency of use of a word leads to an increase in the other person's frequency of use of that word in the same conversation, factoring out random effects of subject and topic of conversation. The basic model I've come up with looks like this:

Linear mixed model fit by REML 
Formula: relative.sents.A ~ relative.sents.B + (1 | id.A) + (1 | abstract) 
   Data: nonzero 
    AIC    BIC logLik deviance REMLdev
 -698.8 -683.9  354.4   -722.6  -708.8
Random effects:
 Groups   Name        Variance   Std.Dev. 
 id.A     (Intercept) 1.0790e-04 0.0103877
 abstract (Intercept) 3.0966e-05 0.0055647
 Residual             2.9675e-04 0.0172263
Number of obs: 146, groups: id.A, 97; abstract, 52

Fixed effects:
                 Estimate Std. Error t value
(Intercept)      0.017260   0.003046   5.667
relative.sents.B 0.428808   0.080050   5.357

Correlation of Fixed Effects:
            (Intr)
rltv.snts.B -0.742

The "dependent" variable is the relative frequency of use by one person, and the fixed effect is the relative frequency of use by the other. I decided to see what the R^2 would be:

> cor(nonzero$relative.sents.A, fitted(nonzero.lmer))^2
[1] 0.6705905

To see what proportion of this is due to the fixed effect, I made a new model with only the random effects:

> summary(r.only.lmer)
Linear mixed model fit by REML 
Formula: relative.sents.A ~ 1 + (1 | id.A) + (1 | abstract) 
   Data: nonzero 
    AIC    BIC logLik deviance REMLdev
 -678.2 -666.3  343.1   -696.7  -686.2
Random effects:
 Groups   Name        Variance   Std.Dev. 
 id.A     (Intercept) 1.2868e-04 0.0113435
 abstract (Intercept) 7.8525e-06 0.0028022
 Residual             3.7643e-04 0.0194017
Number of obs: 146, groups: id.A, 97; abstract, 52

Fixed effects:
            Estimate Std. Error t value
(Intercept) 0.029149   0.002088   13.96

…and tried the same thing:

> cor(nonzero$relative.sents.A, fitted(r.only.lmer))^2
[1] 0.6882534

To my surprise, without that fixed effect, R^2 seems to increase!

Does this mean my model is useless? If so, any suggestions on what might be wrong? Or am I somehow misinterpreting these results?

Best Answer

You're misinterpreting these results, which is easy to do: with mixed models there is more than one type of 'fitted value', and the documentation of lmer isn't as clear as it might be. Try basing the correlation on the fixed-effects part of the prediction alone (the fixed-effects model matrix multiplied by the coefficients from fixed.effects()) in place of fitted(), and you should get correlations that make more intuitive sense if you're interested in the contribution of the fixed effects.
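A minimal sketch of that comparison, assuming a recent lme4 in which predict() accepts re.form = NA to drop all random effects. The sleepstudy data set (shipped with lme4) and the model fitted to it are stand-ins for illustration only, not the question's nonzero data.

```r
library(lme4)

# Stand-in data and model (assumption): sleepstudy ships with lme4.
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Conditional fitted values: fixed effects plus predicted random effects.
cond <- fitted(fit)

# Fixed-effects-only predictions; re.form = NA leaves out every random effect.
fixed_only <- predict(fit, re.form = NA)

# Squared correlation of the response with each kind of prediction.
r2_cond  <- cor(sleepstudy$Reaction, cond)^2
r2_fixed <- cor(sleepstudy$Reaction, fixed_only)^2
print(c(conditional = r2_cond, fixed_only = r2_fixed))
```

For an intercept-only model such as r.only.lmer, the fixed-effects-only prediction is a constant, so this correlation is undefined (cor() returns NA); that is consistent with such a model having no fixed-effect contribution to measure.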

The fitted() function of lmer is documented as giving the 'conditional means'. I had to check the Theory.pdf vignette to work out that these include the predictions of the modelled random effects. In the model including the fixed effect, your id.A and residual variances are smaller and your abstract variance is larger than in the random-effects-only model. The amount of shrinkage applied to a predicted random effect, i.e. how far it is pulled from the observed residual towards zero, depends on its variance relative to the residual variance, so it differs between the two models. When calculating the correlation, it seems that in your case this difference in shrinkage just overcomes the improvement from the fixed effect.
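The point that fitted() includes the predicted random effects can be checked directly. The sketch below again uses lme4's sleepstudy data as a stand-in (an assumption, not the question's data) with a single random intercept, and rebuilds the fitted values from their two parts.

```r
library(lme4)

# Stand-in data and model (assumption): one random intercept per subject.
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Fixed-effects part of the prediction: model matrix times coefficients.
fixed_part <- as.vector(model.matrix(fit) %*% fixef(fit))

# Predicted random intercept, looked up for each observation's subject.
b <- ranef(fit)$Subject
random_part <- b[as.character(sleepstudy$Subject), "(Intercept)"]

# fitted() should equal the sum of the two parts up to rounding error.
max_diff <- max(abs(unname(fitted(fit)) - (fixed_part + random_part)))
print(max_diff)
```

With two crossed random intercepts, as in the question's model, the same reconstruction would add one looked-up term per grouping factor.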

The interpretation of $R^2$ as 'proportion of variance explained' gets more complex with mixed models, as it depends on whether you think of random effects as 'explaining' variance. Probably not, in most cases.
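One common convention makes that choice explicit by reporting two variance-based figures, often called the marginal $R^2$ (fixed effects only) and the conditional $R^2$ (fixed plus random effects). The sketch below is a hand-rolled illustration for a random-intercept model, using lme4's sleepstudy data as a stand-in; it is not the question's model, and it does not handle random slopes.

```r
library(lme4)

# Stand-in data and model (assumption): random intercepts only.
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Variance of the fixed-effects-only predictions.
var_fixed <- var(as.vector(model.matrix(fit) %*% fixef(fit)))

# Random-intercept variance(s) and residual variance from the fit.
vc <- as.data.frame(VarCorr(fit))
var_random <- sum(vc$vcov[vc$grp != "Residual"])
var_resid  <- sigma(fit)^2

total <- var_fixed + var_random + var_resid

# Marginal: random effects are not counted as 'explaining' variance.
r2_marginal <- var_fixed / total
# Conditional: random effects are counted as 'explaining' variance.
r2_conditional <- (var_fixed + var_random) / total
print(c(marginal = r2_marginal, conditional = r2_conditional))
```

The gap between the two numbers is the share of variance attributed to the grouping factors, which is the part whose status as 'explained' is a matter of interpretation.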