Solved – Metafor rma.mv function: missing estimates for two levels of a categorical moderator

meta-analysis, multilevel-analysis, r

I am conducting a meta-analysis with several categorical moderators and one continuous moderator; I am also interested in an interaction between two of the moderators (outcome and scale).

I have used the following code to do this:

Meta1<-rma.mv(yi=G, V=VG, mods = ~ Outcome * Scale + Scale_Type + Profac_def + Gender + Bias, random = list(~ 1 | Author_Num, ~ 1 | Study.in.Author, ~ 1 | Scale.in.Study, ~ 1 | Outcome.in.Study))

I then removed the intercept to obtain estimates for each level of each factor (and for the scale*outcome combinations), using the following:

Meta1.1<-rma.mv(yi=G, V=VG, mods = ~ Outcome * Scale + Scale_Type + Profac_def + Gender + Bias - 1, random = list(~ 1 | Author_Num, ~ 1 | Study.in.Author, ~ 1 | Scale.in.Study, ~ 1 | Outcome.in.Study))

Both outcome and scale have a lot of levels (12 and 19, respectively). However, the output is not showing all levels of the scale factor, or all of the outcome*scale combinations.

In the model with the intercept, one level of the scale moderator is missing, along with a large number of the combinations. In the model without the intercept, two levels of scale are missing (including the reference category), and again a large number of the combinations.

I am very new to R and have no idea what is causing this. If anyone can suggest a solution or advise how I can get around this, it would be greatly appreciated.

Best Answer

If you examine the output carefully, you should see the following warning, which is issued when fitting your models:

Warning message:
In rma.mv(...) :
  Redundant predictors dropped from the model.

The problem is that the model matrix (which is formed based on the moderators that you include in the model) is not of full rank. Put another way: There is not sufficient data available to estimate each of the coefficients in the model. As a result, the function has dropped moderators/predictors from the model that are redundant (so that the reduced model matrix is of full rank).
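A minimal base R sketch of what "not of full rank" means, using a hypothetical two-factor toy dataset (the names `dat`, `A`, and `B` are made up for illustration and are not from the question). One combination of levels never occurs, so the corresponding interaction column is zero for every row and its coefficient cannot be estimated:

```r
# Toy data (hypothetical): the combination A = "a2", B = "b2" never occurs
dat <- data.frame(
  A = factor(c("a1", "a1", "a1", "a2", "a2", "a2")),
  B = factor(c("b1", "b2", "b2", "b1", "b1", "b1"))
)

# Model matrix implied by a formula with an interaction
X <- model.matrix(~ A * B, data = dat)

n_coef <- ncol(X)      # number of coefficients the formula asks for
rank_X <- qr(X)$rank   # number of coefficients the data can support

# Columns that are zero in every row correspond to unobserved combinations
zero_cols <- colnames(X)[colSums(abs(X)) == 0]

cat("coefficients requested:", n_coef, "\n")
cat("rank of model matrix:  ", rank_X, "\n")
cat("all-zero columns:      ", zero_cols, "\n")
```

Whenever the rank is smaller than the number of columns, some coefficients are redundant. Presumably that is the situation the warning above reports, although the sketch only mimics the check and does not show how rma.mv itself decides which predictors to drop.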

You note that Outcome and Scale have 12 and 19 levels, respectively. That alone means that a total of $12\times19 = 228$ coefficients would need to be estimated for the Outcome by Scale part of the model, plus additional coefficients for the other moderators. Unless you have thousands of data points (and each combination of Outcome and Scale actually occurs in your dataset), this isn't going to work.
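The count can be checked in base R by building the model matrix for a fully crossed design. This is only a sketch: the level names `O1`...`O12` and `S1`...`S19` are placeholders, and only the numbers of levels (12 and 19) are taken from the question.

```r
# Fully crossed grid with placeholder level names (hypothetical)
grid <- expand.grid(
  Outcome = factor(paste0("O", 1:12)),
  Scale   = factor(paste0("S", 1:19))
)

# With intercept: 1 + 11 + 18 + 11 * 18 columns
X_int <- model.matrix(~ Outcome * Scale, data = grid)

# Without intercept: 12 + 18 + 11 * 18 columns
X_noint <- model.matrix(~ Outcome * Scale - 1, data = grid)

n_int   <- ncol(X_int)
n_noint <- ncol(X_noint)
cat("with intercept:   ", n_int, "\n")
cat("without intercept:", n_noint, "\n")
```

Removing the intercept only changes how the columns are coded, not how many there are, so both versions of the model ask for the same number of coefficients from the Outcome and Scale terms.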

You need to consider simplifying your model, for example by collapsing some of the levels of the Outcome and Scale variables into a smaller number of levels. Even then, you need to check that each combination of levels actually occurs. You can easily check this by examining a contingency table of those factors:

table(Outcome, Scale)
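As a rough illustration of both steps (checking the table and collapsing levels), here is a sketch on a small hypothetical dataset. The level names and the choice to merge `S3` and `S4` into `other` are invented for the example; which levels it makes sense to merge in the real data is a substantive decision.

```r
# Hypothetical toy data standing in for the real Outcome and Scale columns
dat <- data.frame(
  Outcome = factor(c("anxiety", "anxiety", "depression", "depression",
                     "stress", "stress", "wellbeing")),
  Scale   = factor(c("S1", "S2", "S1", "S3", "S2", "S2", "S4"))
)

# Cross-tabulate the two factors; a 0 marks a combination with no data
tab <- with(dat, table(Outcome, Scale))
print(tab)
empty_before <- sum(tab == 0)

# Collapse the rarely used scales S3 and S4 into a single level
dat$Scale_collapsed <- dat$Scale
levels(dat$Scale_collapsed) <- list(S1 = "S1", S2 = "S2",
                                    other = c("S3", "S4"))

# Re-check the table after collapsing
tab_collapsed <- with(dat, table(Outcome, Scale_collapsed))
print(tab_collapsed)
empty_after <- sum(tab_collapsed == 0)

cat("empty cells before:", empty_before, " after:", empty_after, "\n")
```

Collapsing reduces the number of empty cells but does not necessarily remove them all, which is why the table needs to be checked again afterwards.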

You may also need to reconsider whether it is even possible/realistic to estimate the interaction between those two factors. Maybe you need to just stick to a model with main effects only.
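To see why a main-effects model can be feasible when the interaction is not, the sketch below compares the two formulas on a hypothetical toy dataset in which only 6 of the 9 combinations occur. The helper `is_full_rank` is an illustrative name, not a metafor function.

```r
# Hypothetical toy data: only 6 of the 9 A-by-B combinations occur
dat <- data.frame(
  A = factor(c("a1", "a1", "a2", "a2", "a3", "a3")),
  B = factor(c("b1", "b2", "b2", "b3", "b3", "b1"))
)

# TRUE when the model matrix has as many independent columns as coefficients
is_full_rank <- function(formula, data) {
  X <- model.matrix(formula, data = data)
  qr(X)$rank == ncol(X)
}

main_ok <- is_full_rank(~ A + B, dat)  # main effects only: 5 coefficients
int_ok  <- is_full_rank(~ A * B, dat)  # with interaction: 9 coefficients

cat("main effects estimable:", main_ok, "\n")
cat("interaction estimable: ", int_ok, "\n")
```

A full-rank model matrix is only a minimum requirement: it says every coefficient can be estimated in principle, not that the estimates will be precise with so few observations per level.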
