I have a linear mixed-effect model which I hope will answer the question of whether an increase in the frequency of use of one word leads to an increase of the frequency of use of that word by another person in a conversation, factoring out random effects of subject and topic of conversation. The basic model I've come up with looks like this:
Linear mixed model fit by REML
Formula: relative.sents.A ~ relative.sents.B + (1 | id.A) + (1 | abstract)
   Data: nonzero
    AIC    BIC logLik deviance REMLdev
 -698.8 -683.9  354.4   -722.6  -708.8
Random effects:
 Groups   Name        Variance   Std.Dev.
 id.A     (Intercept) 1.0790e-04 0.0103877
 abstract (Intercept) 3.0966e-05 0.0055647
 Residual             2.9675e-04 0.0172263
Number of obs: 146, groups: id.A, 97; abstract, 52
Fixed effects:
                 Estimate Std. Error t value
(Intercept)      0.017260   0.003046   5.667
relative.sents.B 0.428808   0.080050   5.357
Correlation of Fixed Effects:
            (Intr)
rltv.snts.B -0.742
The "dependent" variable is relative frequency of use by one person, and the fixed variable is relative frequency of use by another. I decided to see what the R^2 would be:
> cor(nonzero$relative.sents.A, fitted(nonzero.lmer))^2
[1] 0.6705905
To see what proportion of this is due to the fixed effect, I made a new model with only the random effects:
> summary(r.only.lmer)
Linear mixed model fit by REML
Formula: relative.sents.A ~ 1 + (1 | id.A) + (1 | abstract)
   Data: nonzero
    AIC    BIC logLik deviance REMLdev
 -678.2 -666.3  343.1   -696.7  -686.2
Random effects:
 Groups   Name        Variance   Std.Dev.
 id.A     (Intercept) 1.2868e-04 0.0113435
 abstract (Intercept) 7.8525e-06 0.0028022
 Residual             3.7643e-04 0.0194017
Number of obs: 146, groups: id.A, 97; abstract, 52
Fixed effects:
            Estimate Std. Error t value
(Intercept) 0.029149   0.002088   13.96
…and tried the same thing:
> cor(nonzero$relative.sents.A, fitted(r.only.lmer))^2
[1] 0.6882534
To my surprise, without that fixed effect, R^2 seems to increase!
Does this mean my model is useless? If so, any suggestions on what might be wrong? Or am I somehow misinterpreting these results?
Best Answer
You're misinterpreting these results, which is easy to do: with mixed models there is more than one type of 'fitted value', and the documentation of `lmer` isn't as clear as it might be. Try using `fixed.effects()` in place of `fitted()` and you should get correlations that make more intuitive sense if you're interested in the contribution of the fixed effects.
The `fitted()` function of `lmer` is documented as giving the 'conditional means'. I had to check the Theory.pdf vignette to work out that these include the predictions of the modelled random effects. Your modelled random effect variances are, overall, smaller in the model including the fixed effect. But smaller random effects mean less shrinkage, i.e. the predicted random effect is closer to the observed residual. When calculating the correlation, it seems that in your case this smaller shrinkage just overcomes the improvement from the fixed effect.
The interpretation of $R^2$ as 'proportion of variance explained' gets more complex with mixed models, as it depends on whether you think of random effects as 'explaining' variance. Probably not, in most cases.
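To make the distinction between the two kinds of fitted value concrete, here is a minimal sketch in R on simulated data. It assumes the quantity of interest is the squared correlation between the response and the fixed-effects-only predictions, which it builds by hand from `fixef()` (the lme4 accessor for the fixed-effect coefficients). The data frame `sim` and its columns `id`, `abstract`, `x` and `y` are made up for illustration and only loosely mimic the structure of the data in the question.

```r
# Sketch on simulated data; all names here are illustrative assumptions,
# not the asker's actual data or model objects.
library(lme4)

set.seed(1)
n <- 150
sim <- data.frame(
  id       = factor(sample(1:60, n, replace = TRUE)),  # speaker
  abstract = factor(sample(1:40, n, replace = TRUE)),  # conversation topic
  x        = runif(n, 0, 0.1)                          # partner's relative frequency
)
# Response = fixed slope on x + speaker effect + topic effect + noise
sim$y <- 0.02 + 0.4 * sim$x +
  rnorm(60, sd = 0.010)[sim$id] +
  rnorm(40, sd = 0.005)[sim$abstract] +
  rnorm(n, sd = 0.017)

full <- lmer(y ~ x + (1 | id) + (1 | abstract), data = sim)

# Conditional fitted values: fixed part plus predicted random effects
r2_conditional <- cor(sim$y, fitted(full))^2

# Fixed-effects-only predictions: design matrix times the fixed coefficients
X <- cbind(1, sim$x)
fixed_only <- as.vector(X %*% fixef(full))
r2_fixed <- cor(sim$y, fixed_only)^2

print(c(conditional = r2_conditional, fixed_only = r2_fixed))
```

With a single fixed predictor, `fixed_only` is a linear function of `x`, so `r2_fixed` is simply the squared correlation between `y` and `x`, whatever the estimated coefficients are. In recent versions of lme4, `predict(full, re.form = NA)` should return the same fixed-effects-only predictions without building the design matrix by hand.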