Solved – Partitioning variance within a level in a 3-level mixed effects model

intraclass-correlation, lme4-nlme, mixed model, r, variance

I have a 3-level repeated-measures model: observations (level 1) nested within patients (level 2) nested within providers (level 3). One of my questions concerns the percentage of variance explained at level 2 versus level 3 for the dependent variable (outcome). I am estimating these with the intra-class correlation coefficient, ICC (ρ).

I am trying to fit a model that partitions the variance in my level-2 and level-3 random effects by patients' racial status (White/non-White). I could subset my data, run three separate models, and compute the ICCs individually, but I am hoping to do this in one model, which would also let me test the significance of differences between variance estimates by leaving a term out and running likelihood-ratio tests, e.g. anova(model1, model2).

Here is my baseline (random-intercept) model for calculating the ICCs on the complete sample (note: there is no cross-classification; each patient has their own provider, and no two providers see the same patient):

model1 <- lmer(outcome ~ (1|id) + (1|provider), data = dat)
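For reference, the ICCs from a baseline model like this can be pulled from VarCorr(). This is a hedged sketch on simulated data (the column names mirror the question's dat, which is not shown), and the cumulative definition of the patient-level ICC used here is one of several conventions:

```r
# Minimal simulated analogue of the question's data: patients nested in
# providers, repeated observations per patient.
library(lme4)

set.seed(1)
n_provider <- 39; n_id <- 390; n_obs <- 5
dat <- data.frame(
  provider = rep(seq_len(n_provider), each = (n_id / n_provider) * n_obs),
  id       = rep(seq_len(n_id), each = n_obs)
)
dat$outcome <- 5 +
  rnorm(n_provider, sd = 0.3)[dat$provider] +  # provider (level-3) effect
  rnorm(n_id,       sd = 0.6)[dat$id] +        # patient (level-2) effect
  rnorm(nrow(dat),  sd = 0.4)                  # residual (level-1)

model1 <- lmer(outcome ~ (1 | id) + (1 | provider), data = dat)

# Extract the variance components and form the ICCs.
vc <- as.data.frame(VarCorr(model1))
v  <- setNames(vc$vcov, vc$grp)
total <- sum(v)
icc_provider <- v[["provider"]] / total                   # level-3 share
icc_patient  <- (v[["id"]] + v[["provider"]]) / total     # cumulative level-2 share
```

Whether the patient-level ICC should include the provider variance in the numerator (as above) or not depends on which correlation you want to describe; both conventions appear in the multilevel literature.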

Rather than subset the data and run three models, I want to run one model and estimate both patient and provider variability in the outcome for White versus non-White patients. I ran this model:

model2 <- lmer(outcome ~ rem + (1|id/rem) + (1|provider/rem), data = dat)

Note: REM (racial-ethnic minority) status is coded as a factor with levels "White" and "REM". I have about 1000 White patients, 500 REM patients, and 39 providers (each provider has seen at least 5 patients).

My understanding is that specifying separate (1 | …) intercept terms like this (rather than something like (rem | id)) keeps the model from estimating covariances between the random effects, which I do not want, as then I would not be able to estimate the ICCs.

Model 2 output is as follows:

    REML criterion at convergence: 13694.2

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-10.2143  -0.3806   0.0751   0.4585   5.6865 

Random effects:
 Groups          Name        Variance Std.Dev.
 rem:id          (Intercept) 0.21617  0.46494 
 id              (Intercept) 0.36515  0.60428 
 rem:provider    (Intercept) 0.05808  0.24100 
 provider        (Intercept) 0.00515  0.07176 
 Residual                    0.17966  0.42386 
Number of obs: 8606, groups:  rem:id, 1504; id, 1504; rem:provider, 78; provider, 39

Fixed effects:
              Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)  5.9928132  0.0483799 66.1700000 123.870   <2e-16 ***
remrem      -0.0005091  0.0740942 52.5400000  -0.007    0.995    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
       (Intr)
remrem -0.615

Central Question:

  • Is this the right approach to my aim of partitioning variance across the two levels of REM status (White/REM) and the two levels of analysis (patient and provider)? In the end, I hope to calculate variance explained at all three levels (patient, provider, residual) for three samples (total, White, REM), for a total of 9 ICCs.

I struggle to interpret the Groups under Random effects, e.g. rem:id versus id. It looks like there are two intercept variances at the patient level, but I do not know whether rem:id is stratified by REM status while id pools all patients. I was expecting something like white:id and rem:id so I could estimate my ICCs.
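One way to see what those terms are: lme4 expands (1 | id/rem) into (1 | id) + (1 | id:rem), and because each patient has a single REM value, the interaction has exactly one level per patient, so it duplicates id rather than stratifying by REM status. A toy base-R illustration:

```r
# rem is constant within each patient, so id:rem has one level per patient.
id  <- c(1, 1, 2, 2, 3, 3)
rem <- c("White", "White", "REM", "REM", "White", "White")  # one value per id
n_interaction <- nlevels(interaction(id, rem, drop = TRUE))  # same as n patients
n_patients    <- length(unique(id))
# By contrast, providers see patients from both groups, so rem:provider has
# roughly two levels per provider (hence 78 groups for 39 providers).
```

This is why rem:id reports 1504 groups, identical to id, in the model 2 output.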

Secondary Model Results Questions:

  • The group counts look mostly right; however, for rem:provider the count (78) is double the number of providers (39), and rem:id (1504) equals the total number of patients, the same as id. In my sample there are about 1000 White and 500 REM patients. Is this indicative of an error in my model, or is it simply not assessing the two groups separately?
  • The residual variance in both models appears to be forced into a single estimate. Is there a way to partition this level-1 variance by group? This will be critical for calculating the ICCs.
  • Do I need to include rem as a fixed effect as well? How does including versus excluding it affect the variance estimates?
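On the residual-variance question: lme4 cannot stratify the level-1 residual, but nlme can, via a varIdent() weights structure. A hedged sketch on simulated data (names mirror the question's dat, which is not shown; the ICC-by-group calculation would then use the group-specific residual variances in the denominators):

```r
library(nlme)

# Simulated stand-in: patients nested in providers, REM constant within
# patient, and a larger true residual SD for the REM group (0.6 vs 0.3).
set.seed(2)
n_provider <- 10; n_id <- 100; n_obs <- 4
dat <- data.frame(
  provider = factor(rep(seq_len(n_provider), each = (n_id / n_provider) * n_obs)),
  id       = factor(rep(seq_len(n_id), each = n_obs))
)
dat$rem <- factor(ifelse(as.integer(dat$id) %% 3 == 0, "REM", "White"))
dat$outcome <- 5 + rnorm(n_id, sd = 0.6)[as.integer(dat$id)] +
  rnorm(nrow(dat), sd = ifelse(dat$rem == "REM", 0.6, 0.3))

m_het <- lme(outcome ~ rem,
             random  = ~ 1 | provider/id,            # provider (level 3) / patient (level 2)
             weights = varIdent(form = ~ 1 | rem),   # separate residual SD per REM level
             data    = dat)

# Residual SD of the non-reference group, as a ratio to the reference group:
ratio <- coef(m_het$modelStruct$varStruct, unconstrained = FALSE)
```

VarCorr(m_het) then gives the provider and patient variances, and the varIdent ratio converts the single reported residual SD into one per REM group.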

The best response I have found on how to specify my random effects is Ben Bolker's response here, but I am still having trouble interpreting the model results and pulling out variance estimates to calculate the ICCs.

Best Answer

I figured out a workaround, although the residual variance is not partitionable in lme4 (as far as I know). The workaround is to create separate numeric dummy variables for White and REM status: white_dummy = 1 for White patients (0 otherwise) and rem_dummy = 1 for REM patients (0 otherwise). Then I ran the following model:

model3 <- lmer(outcome ~ rem +
 (0 + white_dummy | id) +
 (0 + rem_dummy | id) +
 (0 + white_dummy | provider) +
 (0 + rem_dummy | provider),
 data = dplyr::filter(dat, rem == 1 | rem == 0))

and this creates the following random effects output:

Random effects:
 Groups        Name         Variance Std.Dev. Corr
 id            white_dummy0 0.19631  0.4431       
               white_dummy1 0.39760  0.6306   0.06
 id.1          rem_dummy0   0.18381  0.4287       
               rem_dummy1   0.39400  0.6277   0.24
 provider_1    white_dummy0 0.06192  0.2488       
               white_dummy1 0.01334  0.1155   0.11
 provider_1.1  rem_dummy0   0.02774  0.1666       
               rem_dummy1   0.03867  0.1966   0.00
Residual                    0.17943  0.4236       
Number of obs: 8606, groups:  id, 1504; provider_1, 39

However, the model failed to converge, likely due to the small number of observations at the provider-by-REM level and variance estimates near zero. I am going to try a Bayesian approach with MCMCglmm to fit this model.
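For the MCMCglmm route, the usual way to get a separate variance per REM level at every stratum, including the residual, is idh() variance structures. A hedged sketch (priors and iteration settings are illustrative only; simulated data stands in for the question's dat):

```r
library(MCMCglmm)

# Simulated stand-in: patients nested in providers, REM constant within patient.
set.seed(3)
n_provider <- 10; n_id <- 100; n_obs <- 4
dat <- data.frame(
  provider = factor(rep(seq_len(n_provider), each = (n_id / n_provider) * n_obs)),
  id       = factor(rep(seq_len(n_id), each = n_obs))
)
dat$rem <- factor(ifelse(as.integer(dat$id) %% 3 == 0, "REM", "White"))
dat$outcome <- 5 + rnorm(n_id, sd = 0.6)[as.integer(dat$id)] +
  rnorm(nrow(dat), sd = ifelse(dat$rem == "REM", 0.6, 0.3))

# Weakly informative priors: one 2x2 diagonal block per idh() term
# (illustrative settings, not a recommendation).
prior <- list(
  R = list(V = diag(2), nu = 0.002),
  G = list(G1 = list(V = diag(2), nu = 0.002),
           G2 = list(V = diag(2), nu = 0.002))
)

m_mc <- MCMCglmm(outcome ~ rem,
                 random = ~ idh(rem):id + idh(rem):provider,  # variance per REM level
                 rcov   = ~ idh(rem):units,                   # residual per REM level
                 prior  = prior, data = dat,
                 nitt = 3000, burnin = 1000, thin = 10, verbose = FALSE)

# Six variance components: 2 patient, 2 provider, 2 residual (one per REM level),
# from which the 9 ICCs (or their posterior distributions) can be formed.
n_components <- ncol(m_mc$VCV)
```

Because the variance components are posterior samples, differences between the White and REM ICCs can be assessed directly from their posterior distributions rather than by leave-one-out likelihood-ratio tests.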