Solved – Model selection: testing the need for random-effects terms in longitudinal data

anova · lme4-nlme · mixed-model · r · repeated-measures

I have a longitudinal data set in which each participant was observed for 12 weeks.
I followed this paper: Bliese, Paul D., and Robert E. Ployhart. "Growth modeling using random coefficient models: Model building, testing, and illustrations." Organizational Research Methods 5.4 (2002): 362-387.

First I fitted a generalized least squares model, which produces the following result:
library(nlme)  # gls() and lme() below are from the nlme package
model1 <- gls(X ~ group*time, data = dataFrame)

Coefficients:
                       Value   Std.Error   t-value p-value
(Intercept)        1.6933389 0.009814656 172.53167  0.0000
group0            -0.0586920 0.010610159  -5.53168  0.0000
time               0.0005821 0.000192112   3.02993  0.0024
group0:time       -0.0006525 0.000207683  -3.14177  0.0017 

Then I fitted a random-intercept model:

model2 <- lme(X ~ group*time, random = ~1|id, data = dataFrame)

Random effects:
 Formula: ~1 | id
        (Intercept)  Residual
StdDev:   0.2067486 0.2744509

Fixed effects: X ~ group * time 
                       Value   Std.Error    DF  t-value p-value
(Intercept)        1.6933389 0.023882981 44230 70.90149  0.0000
group0            -0.0586920 0.025818758   580 -2.27323  0.0234
time               0.0005821 0.000153538 44230  3.79115  0.0002
group0:time       -0.0006525 0.000165983 44230 -3.93109  0.0001

The fixed-effect estimates are almost identical to model1's, apart from the standard errors of the intercept and group0.

Then I ran a likelihood ratio test in order to choose between the models; it shows that the two models differ significantly.

anova(model1, model2)

Model      df      AIC      BIC     logLik   Test  L.Ratio p-value
model1     1  5 31435.78 31479.33  -15712.890                        
model2     2  6 13555.15 13607.41  -6771.574 1 vs 2 17882.63  <.0001

I am a bit confused about which model to choose: the standard errors for the intercept and group0 are a bit smaller in model1, but based on the likelihood ratio test, should I choose the model with a random intercept?
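One point worth checking when comparing a gls fit to an lme fit with anova(): both models were fitted with the default REML, and a REML-based likelihood ratio test is only valid when the fixed-effects specifications are identical (as they are for model1 and model2). A sketch with simulated data (the variable names dataFrame, X, group, time, id mirror the poster's but the data are made up) shows the same comparison done with explicit ML estimation, which is the safe default when in doubt:

```r
library(nlme)
set.seed(1)
# Simulated stand-in for the poster's data: 30 participants, 12 time points
dataFrame <- data.frame(
  id    = factor(rep(1:30, each = 12)),
  time  = rep(0:11, times = 30),
  group = factor(rep(c("0", "1"), each = 12 * 15))
)
dataFrame$X <- 1.7 + 0.01 * dataFrame$time +
  rep(rnorm(30, sd = 0.2), each = 12) +   # per-id random intercept
  rnorm(nrow(dataFrame), sd = 0.3)        # residual noise

# Refit both models with ML so the likelihoods are directly comparable
m1 <- gls(X ~ group * time, data = dataFrame, method = "ML")
m2 <- lme(X ~ group * time, random = ~ 1 | id, data = dataFrame, method = "ML")
anova(m1, m2)  # likelihood ratio test for the random intercept
```

With a true per-id intercept spread, the test strongly favors the random-intercept model, matching the pattern in the poster's output.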

–Updated–

model3 <- lme(X ~ group*time, random = ~time|id, data = dataFrame)

Random effects:
 Formula: ~time | id
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev      Corr  
(Intercept) 0.202541906 (Intr)
time        0.003067617 -0.317
Residual    0.265761977       

Fixed effects: X ~ group * time 
                       Value   Std.Error    DF  t-value p-value
(Intercept)         1.6933389 0.023368045 44230 72.46387  0.0000
group0             -0.0586920 0.025262085   580 -2.32333  0.0205
time                0.0005821 0.000366240 44230  1.58935  0.1120
group0:time        -0.0006525 0.000395925 44230 -1.64802  0.0994

anova(model1, model2, model3)

 Model     df      AIC      BIC     logLik   Test   L.Ratio p-value
model1     1  5 31435.78 31479.33 -15712.890                         
model2     2  6 13555.15 13607.41  -6771.574 1 vs 2 17882.633  <.0001
model3     3  8 11689.56 11759.24  -5836.779 2 vs 3  1869.588  <.0001

Since I am interested in the group effect on growth, it concerns me that the slope terms are no longer significant in model3. Should I still choose model3?

Best Answer

The likelihood ratio test is slightly incorrect (in general, conservative) for testing the significance of a random effect, because the null value ($\sigma^2=0$) lies on the boundary of the feasible parameter space; but in this case there is overwhelmingly strong evidence against the null hypothesis. The model with random effects of individual is 15713-6772=8941 log-likelihood units better; twice the log-likelihood difference is $\chi^2$ distributed, so the direct p-value calculation would give you ...

pchisq(2*8941,df=1,lower.tail=FALSE,log.p=TRUE)/log(10)
## -3885.251

... a p-value of approximately $10^{-3885}$.
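The boundary issue mentioned above can be handled with the standard correction: when a single variance component is tested on the boundary, the null distribution of the LRT statistic is a 50:50 mixture of $\chi^2_0$ and $\chi^2_1$ (Self and Liang), so the naive p-value can simply be halved. A sketch on the log10 scale shows this changes essentially nothing here:

```r
# Naive chi-squared(1) p-value on the log10 scale, as computed above
naive_log10_p <- pchisq(2 * 8941, df = 1, lower.tail = FALSE, log.p = TRUE) / log(10)

# Boundary correction: halve the p-value, i.e. subtract log10(2)
# on the log scale -- a negligible change at this magnitude
corrected_log10_p <- naive_log10_p - log10(2)
round(c(naive = naive_log10_p, corrected = corrected_log10_p), 1)
```

Either way the p-value is around $10^{-3885}$, so the correction is immaterial for the conclusion.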

You should really consider a random-slope model (random = ~time|id) as well.

Update: relative to the random-intercept model, the random-slopes model is again much better. The improvement is now 935 log-likelihood units which, by the same calculation as above, corresponds to rejecting the null hypothesis (among-individual variation in slope equal to zero) with a p-value of "only" $10^{-408}$.
