Solved – Oddly large R-squared values in meta-regression (metafor)

Tags: meta-analysis, meta-regression, r

I am using the metafor package in R. I have fit a random-effects model with a continuous predictor as follows:

library(metafor)
SIZE <- rma(yi = Ds, sei = SE, mods = ~ SIZE, data = VPPOOLed)

which yields the following output:

R^2 (amount of heterogeneity accounted for):            63.62%
Test of Moderators (coefficient(s) 2): 
QM(df = 1) = 9.3255, p-val = 0.0023

Model Results:

         estimate      se    zval    pval   ci.lb   ci.ub
intrcpt    0.3266  0.1030  3.1721  0.0015  0.1248  0.5285  **
SIZE       0.0481  0.0157  3.0538  0.0023  0.0172  0.0790  **

Below I have plotted the regression. The effect sizes are plotted with point sizes proportional to the inverse of their standard errors. I realize that this is a subjective statement, but the R^2 value (63% variance explained) seems a lot larger than the modest relationship shown in the plot would suggest (even taking the weights into account).

[Plot: meta-regression of Ds on SIZE, with points sized proportionally to 1/SE]

To show you what I mean, if I then run the same regression with the lm function (specifying the study weights as the inverse of the standard errors):

lmod <- lm(Ds ~ SIZE, weights = 1/SE, data = VPPOOLed)

then the R^2 drops to 28% variance explained. This seems closer to the way things are (or at least, to my impression of what kind of R^2 should correspond to the plot).
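For reference, the 28% figure is the ordinary weighted-least-squares R^2 that summary() reports; assuming the fit above:

summary(lmod)$r.squared  # weighted R^2 from the lm fit (about 0.28)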

After reading this article (including its meta-regression section), I realize that differences in the way the lm and rma functions apply weights can influence the model coefficients: http://www.metafor-project.org/doku.php/tips:rma_vs_lm_and_lme. However, it is still unclear to me why the R^2 values are so much larger in the meta-regression case. Why does a model with an apparently modest fit account for over half of the heterogeneity in effects?

Is the larger R^2 value due to the variance being partitioned differently in the meta-analytic case (sampling variability versus other sources)? Specifically, does the R^2 reflect the percentage of heterogeneity accounted for within the portion that cannot be attributed to sampling variability? Perhaps there is a difference between "variance" in a non-meta-analytic regression and "heterogeneity" in a meta-analytic regression that I am not appreciating.

I'm afraid subjective statements like "It doesn't seem right" are all I have to go on here. Any help with interpreting R2 in the meta-regression case would be much appreciated.

Best Answer

The pseudo-$R^2$ value that is reported is computed with: $$R^2 = \frac{\hat{\tau}^2_{RE} - \hat{\tau}^2_{ME}}{\hat{\tau}^2_{RE}},$$ where $\hat{\tau}^2_{RE}$ is the (total) amount of heterogeneity as estimated based on a random-effects model and $\hat{\tau}^2_{ME}$ is the amount of (residual) heterogeneity as estimated based on the mixed-effects meta-regression model. Note that this isn't anything specific to the metafor package -- it's how this value is typically computed in mixed-effects meta-regression models.
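In code, and assuming the data frame from the question (VPPOOLed with columns Ds, SE, and SIZE), the computation amounts to the following sketch:

library(metafor)

# tau^2 from the random-effects model (no moderators):
res.RE <- rma(yi = Ds, sei = SE, data = VPPOOLed)

# residual tau^2 from the mixed-effects meta-regression model:
res.ME <- rma(yi = Ds, sei = SE, mods = ~ SIZE, data = VPPOOLed)

# proportional reduction in the estimated amount of heterogeneity:
(res.RE$tau2 - res.ME$tau2) / res.RE$tau2  # should essentially match the reported R^2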

This value estimates the amount of heterogeneity that is accounted for by the moderators/covariates included in the meta-regression model (i.e., it is the proportional reduction in the amount of heterogeneity after including moderators/covariates in the model). Note that it does not involve sampling variability at all. Hence, it is quite possible to get very large $R^2$ values, even when there are still discrepancies between the regression line and the observed effect sizes (when those discrepancies are not much larger than what one would expect based on sampling variability alone). In fact, when $\hat{\tau}^2_{ME} = 0$ (which can certainly happen), then $R^2 = 1$ -- but this doesn't imply that the points all fall on the regression line (the residuals are just not larger than expected based on sampling variability).
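To illustrate the last point, here is a small simulated example (hypothetical data, in which all scatter around the regression line comes from sampling error alone):

library(metafor)

set.seed(42)
k    <- 20
SIZE <- runif(k, 1, 10)
vi   <- runif(k, 0.05, 0.20)                       # known sampling variances
yi   <- 0.3 + 0.05 * SIZE + rnorm(k, 0, sqrt(vi))  # no true heterogeneity

res <- rma(yi, vi, mods = ~ SIZE)
res$tau2  # estimate of residual heterogeneity; typically 0 here
res$R2    # then 100%, despite visible scatter around the regression line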

Regardless, it is important to realize that this pseudo-$R^2$ statistic is not very trustworthy unless the number of studies is large. See, for example, this article:

López-López, J. A., Marín-Martínez, F., Sánchez-Meca, J., Van den Noortgate, W., & Viechtbauer, W. (2014). Estimation of the predictive power of the model in mixed-effects meta-regression: A simulation study. British Journal of Mathematical and Statistical Psychology, 67(1), 30–48.

In essence, I wouldn't place too much trust in the actual value unless you have at least 30 studies (but don't quote me exactly on that figure). For a nice exercise, you could use bootstrapping to obtain an approximate CI for $R^2$. Pretty much all you need to know to do this is explained here:

http://www.metafor-project.org/doku.php/tips:bootstrapping_with_ma

Just change the value that is returned by the boot.func() function to res$R2 (and since there is no variance estimate for $R^2$, you cannot get the studentized intervals). In your case, you will probably end up with a very wide CI (possibly extending pretty much from 0 to 100%).
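Adapted to your model, the bootstrap might look roughly like this (a sketch, assuming your VPPOOLed data frame; the code on the linked page differs slightly):

library(metafor)
library(boot)

boot.func <- function(dat, indices) {
   res <- try(rma(yi = Ds, sei = SE, mods = ~ SIZE, data = dat[indices, ]),
              silent = TRUE)
   if (inherits(res, "try-error")) NA else res$R2
}

set.seed(1234)
res.boot <- boot(VPPOOLed, boot.func, R = 5000)
boot.ci(res.boot, type = c("norm", "basic", "perc", "bca"))  # no "stud" intervals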