R Mixed Model – How to Model Fixed Effect Terms and Their Interactions?

lme4-nlmemixed modelr

I am moving from StackOverflow looking for statistical aspects of the code.

My work is based on an lifestyle intervention made up 3 groups (1, 2 and 3) with measurements at 2 points separated by 12 months (pre- and post-intervention). The sample size is around 50 participants/group, and the response variables are genes, just in case.It is proven which covariates have largely affect the outcome in this studies, such as age or sex.

My understanding of linear mixed models is limited, and constrained to their utility of the general aspects.

Referring to package choice, nlme package allows heterocedasticity, and more flexible covariance structures in the random effects compared to lme4. This is way I picked nlme over lme4.

Having decided this I could possibly have:

  1. lme(gene ~ time:group + age + sex, random= ~ time|id , method="REML", data=tmp, control = lmeControl(opt = "optim"), na.action = na.omit)
  2. lme(gene ~ time:group + group + age + sex, random= ~ time|id , method="REML", data=tmp, control = lmeControl(opt = "optim"), na.action = na.omit)
  3. lme(gene ~ time:group + group + time + age + sex, random= ~ time|id , method="REML", data=tmp, control = lmeControl(opt = "optim"), na.action = na.omit)
  4. lme(gene ~ time:group + time + age + sex, random= ~ time|id , method="REML", data=tmp, control = lmeControl(opt = "optim"), na.action = na.omit)

I don't really understand the difference in including the time effect alone or not. If understood it correctly the grup_int alone captures intra-individual variance (through the intercept). Not sure if time alone (model4) means anything or at least anything contributing the model. I checked model nÂș 2 and 3 don't change at least in the interaction terms.

Thanks in advance

UPDATE

The results obtained by the model2 for the time:grup_int interaction are like this:
enter image description here

Does it mean that the interaction term time:grup_int1 is not significant compared to the other 2 groups? How is this possible when time:grup_int2 is significant? I guess this is not the meaning becaus this would be a overall p-value as a simple ANOVA,but I would get a single value. Which is the meaning then?

Best Answer

lme4 is often a popular choice because it will warn you more when your models don't fit well. In your case, you have only two time points per individual, but you are fitting a model with both a random intercept and a random slope. You don't have enough data to fit a random slope. Your random effect term should instead be random= ~ 1|id. Now you are fitting an interaction term that is only between fixed effects (time:group). The standard rules for interactions apply, you need to include both time and group as separate variables (gene ~ time:group + group + time).

Edit

Here's a worked example.

> library(lme4)
> library(nlme)
> data(sleepstudy)
> 
> ## filter data so only 2 observations per participant
> head(sleepstudy[sleepstudy$Days<2,])
   Reaction Days Subject
1  249.5600    0     308
2  258.7047    1     308
11 222.7339    0     309
12 205.2658    1     309
21 199.0539    0     310
22 194.3322    1     310
> 
> ## nlme doesn't throw an error
> lme(fixed = Reaction ~ 1, 
+          random= ~ Days|Subject , 
+          method="REML", 
+          data=sleepstudy[sleepstudy$Days<2,], 
+          control = lmeControl(opt = "optim"), 
+          na.action = na.omit)
Linear mixed-effects model fit by REML
  Data: sleepstudy[sleepstudy$Days < 2, ] 
  Log-restricted-likelihood: -166.738
  Fixed: Reaction ~ 1 
(Intercept) 
   259.9841 

Random effects:
 Formula: ~Days | Subject
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev   Corr  
(Intercept) 30.23539 (Intr)
Days        18.71391 -0.23 
Residual    11.11483       

Number of Observations: 36
Number of Groups: 18 
> 
> ## lme4 correctly throws an error as the model is not identifiable
> lmer(Reaction ~ 1 + Days|Subject,
+           data = sleepstudy[sleepstudy$Days<2,])
Error: number of observations (=36) <= number of random effects (=36) for term (1 + Days | Subject); the random-effects parameters and the residual variance (or scale parameter) are probably unidentifiable