Regression – Post Hoc Power Analysis for Multilevel Regression Models

Tags: lme4-nlme, multilevel-analysis, post-hoc, regression, statistical-power

I have a multilevel model with 2 levels (L1 = individuals, at least 710 per country; L2 = countries, 17 total):

library(lme4)  # the model below is fitted with lme4's lmer

mod <- lmer(Y ~
              pred1 * pred2 +         # cross-level interaction of the L1 and L2 predictors
              (1 + pred1 | cluster),  # random intercept and pred1 slope per country
            data = d)

where

Y = continuous outcome variable at L1,
pred1 = categorical predictor with levels A and B at L1,
pred2 = continuous predictor at L2,
cluster = grouping variable with 17 groups (countries).

Now, there is a significant pred1 fixed effect (p < .001), and I was asked to perform a post hoc power analysis arguing that 17 L2 groups do not yield acceptable power.

Questions:

  1. Does it ever make sense to do a post hoc power analysis, for either significant or non-significant results? IMO, if the effects of interest are significant, then you are not worried about power (you only worry about it if you want to replicate your results). If the effects are non-significant, then the observed power is simply a deterministic function of the effect's p value, as the sketch after this list illustrates (see also http://daniellakens.blogspot.com/2014/12/observed-power-and-what-to-do-if-your.html).

  2. Some recent articles (e.g., Arend & Schäfer, 2019) claim that one needs at least 30 clusters at L2 for acceptable power for fixed effects (like those in the example above), but this is unrealistic in some settings (e.g., when the clusters are big cities in a small country, or entire countries). Also, an answer by Robert Long (Minimum sample size per cluster in a random effect model) suggests that the number of clusters matters more for power than the number of units within clusters. I've seen many instances where 10 < n < 20 clusters were modeled as random effects in a multilevel model rather than as fixed effects. And even if we modeled them as fixed effects, the power problem (if it really exists) would not magically disappear.
    Can you clarify for me why 30 clusters is considered the minimum for sufficient power in multilevel models?
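
To illustrate the point in Question 1: here is a minimal sketch, my own and not from the linked post, of how observed power is pinned down by the p value. It assumes a simple two-sided z test at alpha = .05; the function name is hypothetical.

observed_power <- function(p, alpha = 0.05) {
  z_obs  <- qnorm(1 - p / 2)       # |z| statistic implied by the observed p value
  z_crit <- qnorm(1 - alpha / 2)   # two-sided critical value
  pnorm(z_obs - z_crit) + pnorm(-z_obs - z_crit)
}
observed_power(0.05)  # 0.50: a result exactly at p = alpha has 50% observed power
observed_power(0.20)  # about 0.25

Since p is the only input, reporting observed power adds no information beyond the p value itself.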

Literature

Arend, M. G., & Schäfer, T. (2019). Statistical power in two-level models: A tutorial based on Monte Carlo simulation. Psychological Methods, 24(1), 1–19. https://doi.org/10.1037/met0000195

Best Answer

Answering in reverse:

Question 2. First, the power estimates in the cited paper are for large numbers of clusters, each of relatively small size. The plots of minimum detectable effect sizes (MDES) in Figures 1-3 only go up to 30 individuals per cluster and 150 clusters: at most 4,500 individuals. You have a different situation: 17 clusters with at least 710 individuals each, over 12,000 individuals in total. That's more cases than in any scenario I saw in a brief look at the paper, and the total number of cases typically matters most.

Second, your model is much less complex than the one illustrated in that paper. The model in the paper allows for a large number of potentially correlated random effects. For example, consider what would happen if your model included random effects both for your fixed-effect terms and for their interaction. The last model in the Mike Lawrence answer on the lmer cheat sheet is such a model: it requires 14 coefficient estimates (4 fixed effects plus 10 random-effect variances and covariances). In your case, the constant value of pred2 within each L2 cluster allows you to omit it as a random effect among clusters. You also omitted the pred1:pred2 interaction as a random effect,* further simplifying the model from what it might have been. By the same count, I think your model only needs 7 coefficient estimates: 4 fixed effects plus 3 random-effect variance/covariance terms, as the sketch below illustrates.
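
As a minimal sketch of that count, assuming hypothetical simulated data with the question's structure (cluster sizes shrunk for speed, and Y is placeholder noise, so the fit itself is meaningless), one can read the parameter counts off the fitted object:

library(lme4)

## Hypothetical data mimicking the question's structure (not the real data)
set.seed(1)
d <- data.frame(
  cluster = rep(paste0("c", 1:17), each = 50),
  pred1   = factor(sample(c("A", "B"), 17 * 50, replace = TRUE))
)
d$pred2 <- rnorm(17)[as.integer(factor(d$cluster))]  # L2 predictor: constant within cluster
d$Y     <- rnorm(nrow(d))                            # placeholder outcome

mod <- lmer(Y ~ pred1 * pred2 + (1 + pred1 | cluster), data = d)

length(fixef(mod))           # 4 fixed-effect coefficients
length(getME(mod, "theta"))  # 3 random-effect variance/covariance parameters
## (both models also estimate 1 residual variance, excluded from the 14-vs-7 count)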

Third, with only 17 clusters you will nevertheless have imprecise estimates of the variances and covariance of the random intercepts and slopes among clusters: in effect, each of those variance components is estimated from only 17 cluster-level values.
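
As a rough analogy, my own and not from the paper: a standard deviation estimated from just 17 i.i.d. normal values already has a wide 95% confidence interval, and cluster-level variance components behave no better.

n      <- 17
sd_hat <- 1  # hypothetical estimated SD
ci <- sd_hat * sqrt((n - 1) / qchisq(c(0.975, 0.025), df = n - 1))
round(ci, 2)  # about 0.74 to 1.52: roughly a two-fold range of plausible SDs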

Question 1. Some recommendations to do a "post-hoc power analysis" might not be as useless as they seem at first glance. An a priori power estimate used to design a study for mixed-model analysis necessarily makes a lot of assumptions, as the linked paper explains. Re-examining those assumptions against the observed data can both resolve prior misconceptions and guide the design of future studies. One might quibble over the terminology "post-hoc power analysis." Whatever you call it, however, it's a good idea to evaluate what went right and what went wrong after you complete a study. For complicated mixed models, there is probably no better tool for that than simulation, as performed for example with the simr package used in the paper; a sketch follows.
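
Here is a minimal sketch of such a simulation with simr, assuming the fitted mod from the question; the likelihood-ratio test specification and the small nsim are illustrative choices, not recommendations.

library(simr)

## Simulated power for the pred1 fixed effect at the observed
## effect size, with the observed 17 clusters
powerSim(mod, test = fixed("pred1", method = "lr"), nsim = 200)

## Extend the design to 30 clusters and re-estimate power, to see
## what the often-recommended cluster count would buy
mod30 <- extend(mod, along = "cluster", n = 30)
powerSim(mod30, test = fixed("pred1", method = "lr"), nsim = 200)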


*I'm not sure about the wisdom of this omission, but I confess to having a lot of problems thinking about mixed models with interactions.
