Solved – Partitioning explained variance to fixed effects by comparing r squared (R2) between linear mixed models

Tags: mixed-model, modeling, r-squared

Let's say I have two linear mixed models. One is simply a subset of the other.

The first contains terms for two fixed effects and a random intercept. One of the fixed effects, "x1", I know a priori belongs in the model. The other fixed effect, "x2", I suspect may explain additional variance in the response (and I'd like to figure out how much).

The second model is the same as the first, except it simply drops "x2".

I calculate marginal r-squared for both models sensu
"Nakagawa, S., and H. Schielzeth. 2013. A general and simple method for obtaining R2 from generalized linear mixed-effects models. Methods in Ecology and Evolution 4(2): 133-142"

According to the paper, marginal r-squared gives the variance explained by the fixed effects part of the model.

My question is: why can, or can't, I take the difference in marginal r-squared values between the two models as the variance explained by "x2"?

(e.g., R2_model1 = 0.5, R2_model2 = 0.4; is the interpretation as simple as "including x2 explains 10% more of the variance"?) I understand that "x2" in a model without "x1" would not necessarily have R2 = 0.1.
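For concreteness, this is roughly how I compute the two marginal r-squared values (a minimal sketch using Python's statsmodels rather than R; the column names `y`, `x1`, `x2`, and `group` are placeholders for my actual data):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# placeholder data: response y, fixed effects x1 and x2, grouping factor "group"
df = pd.read_csv("data.csv")

def marginal_r2(fit):
    """Marginal R^2 sensu Nakagawa & Schielzeth (2013) for a random-intercept model."""
    # sigma^2_f: variance of the fixed-effects predictions
    var_f = np.var(np.dot(fit.model.exog, fit.fe_params))
    var_alpha = fit.cov_re.iloc[0, 0]   # random-intercept variance
    var_eps = fit.scale                 # residual variance
    return var_f / (var_f + var_alpha + var_eps)

model1 = smf.mixedlm("y ~ x1 + x2", df, groups=df["group"]).fit()  # full model
model2 = smf.mixedlm("y ~ x1", df, groups=df["group"]).fit()       # drops x2

r2_m1, r2_m2 = marginal_r2(model1), marginal_r2(model2)
print(r2_m1, r2_m2, r2_m1 - r2_m2)  # the difference I am asking about
```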

Thanks for any help and please let me know if I can improve this question.

Best Answer

Your model is as follows:

$$y_{ij} = \beta_0 + \beta_1 X^{(1)}_{ij} + \beta_2 X^{(2)}_{ij} + b_j + \epsilon_{ij}$$

and $R^2$ for fixed effects ($R^2_m$) is defined by Nakagawa and Schielzeth as:

$$R^2_m = \frac{\sigma^2_f}{\sigma^2_f + \sigma^2_\alpha + \sigma^2_\epsilon}$$

where:

$$\sigma^2_f = var\left( \beta_1 X^{(1)}_{ij} + \beta_2 X^{(2)}_{ij} \right)$$

So, when you drop $X_2$ in the second model, what changes is that $\sigma^2_f = var( \beta_1 X^{(1)}_{ij} )$. You can check this thread on how the variance of a sum relates to the sum of variances, since it bears directly on this formula. In general, $var( \beta_1 X^{(1)} + \beta_2 X^{(2)} ) = \beta_1^2 \, var( X^{(1)} ) + \beta_2^2 \, var( X^{(2)} ) + 2\beta_1\beta_2 \, cov( X^{(1)}, X^{(2)} )$, so the variance of the sum equals the sum of the variances only when the variables are independent and the covariance term vanishes. So if $X_1$ and $X_2$ are uncorrelated, you can decompose $R^2_m$ as:

$$\frac{var( \beta_1 X^{(1)}_{ij})}{var( \beta_1 X^{(1)}_{ij} + \beta_2 X^{(2)}_{ij}) + \sigma^2_\alpha + \sigma^2_\epsilon} + \frac{var( \beta_2 X^{(2)}_{ij})}{var( \beta_1 X^{(1)}_{ij} + \beta_2 X^{(2)}_{ij}) + \sigma^2_\alpha + \sigma^2_\epsilon}$$

and so you can consider the two terms of this sum as "$R^2_{\beta_1}$" and "$R^2_{\beta_2}$", i.e. the variance explained by each of the two terms in this model. In Nakagawa and Schielzeth's approach, the $R^2$'s are proportions of the variance explained by one part of the model relative to the total variance implied by all the components of the model (fixed + random + residual). So it is more relative than the classical $R^2$, which is:

$$R^2 = 1 - \frac{\sum_i(y_i - \hat{y}_i)^2}{\sum_i(y_i - \overline{y})^2}$$

so here, in the classical $R^2$, if you compared different models, the divisor would not change. With $R^2$'s for linear mixed models, the divisor does change. Also, in the classical $R^2$ you compare the model variance to the variance of the dependent variable, while in $R^2$'s for linear mixed models you compare it to the total variance implied by all the effects in the model, so it is more relative. That is also a problem with using it in the way you would like to: looking at differences between different $R^2_m$'s. I personally like the Nakagawa and Schielzeth approach; however, you have to remember that while the idea is that it is an "absolute" measure, in practice it is not as "absolute" as the classical $R^2$ for linear regression. I would treat it as another measure of fit, but certainly not as "more objective" than others. Also, consider that in general $R^2$ for linear models has its flaws (e.g. this question).
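To make the "changing divisor" point concrete, it may help to write the two marginal $R^2$'s out side by side (my notation, not in the original derivation: tildes mark the re-estimated quantities in the reduced model):

$$R^2_{m,1} - R^2_{m,2} = \frac{var( \beta_1 X^{(1)}_{ij} + \beta_2 X^{(2)}_{ij} )}{var( \beta_1 X^{(1)}_{ij} + \beta_2 X^{(2)}_{ij} ) + \sigma^2_\alpha + \sigma^2_\epsilon} - \frac{var( \tilde{\beta}_1 X^{(1)}_{ij} )}{var( \tilde{\beta}_1 X^{(1)}_{ij} ) + \tilde{\sigma}^2_\alpha + \tilde{\sigma}^2_\epsilon}$$

Dropping $X_2$ and refitting changes not only $\sigma^2_f$ but typically also $\tilde{\beta}_1$, $\tilde{\sigma}^2_\alpha$ and $\tilde{\sigma}^2_\epsilon$ (variance no longer captured by $X_2$ tends to move into the random-intercept and residual terms), so the two ratios do not share a denominator and their difference is not, in general, equal to the "$R^2_{\beta_2}$" term from the decomposition above.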

If you wanted to look at the variance explained by individual independent variables, I would rather use the "relative" approach, i.e. "$R^2_{\beta_1}$" and "$R^2_{\beta_2}$". However, $R^2$ is usually used as a measure of fit for a whole model, and I haven't seen such a "partial $R^2$" used by anyone.
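For what it's worth, those per-term quantities could be computed from the full-model fit along the lines below (continuing the Python sketch from the question, so `model1` and `df` refer to that code; this is my reading of the decomposition above, not an established "partial $R^2$", and the two terms only add up to the marginal $R^2$ when x1 and x2 are roughly uncorrelated):

```python
# per-term "R^2_beta" from the full model (model1 and df from the question's sketch)
beta = model1.fe_params
var_b1x1 = np.var(beta["x1"] * df["x1"])                   # var(beta_1 * x1)
var_b2x2 = np.var(beta["x2"] * df["x2"])                   # var(beta_2 * x2)
var_f = np.var(np.dot(model1.model.exog, beta))            # variance of the whole fixed part
denom = var_f + model1.cov_re.iloc[0, 0] + model1.scale    # shared denominator
r2_b1, r2_b2 = var_b1x1 / denom, var_b2x2 / denom
# r2_b1 + r2_b2 approximately equals marginal_r2(model1) only when cov(x1, x2) is ~0
print(r2_b1, r2_b2)
```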

The commonly used approach to compare "how important" some fixed coefficient is would be to check "how big" it is relative to the scale of your dependent variable (does including it change anything in the result you could obtain from the model?) and/or to check the size of standardized coefficients or their $t$-values. Gelman & Hill (2006) give many practical hints on regression models and mixed models, so you could check out their book.
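As a rough illustration of the standardization idea (again reusing `df` from the question's sketch; z-scoring every variable is just one simple convention, not something prescribed here):

```python
# z-score the response and both predictors, then refit: the fixed-effect
# coefficients are then in standard-deviation units and easier to compare
zdf = df.copy()
for col in ["y", "x1", "x2"]:
    zdf[col] = (zdf[col] - zdf[col].mean()) / zdf[col].std()

std_fit = smf.mixedlm("y ~ x1 + x2", zdf, groups=zdf["group"]).fit()
print(std_fit.summary())  # standardized coefficients with their test statistics
```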