Having read through a few posts, I still couldn't find an answer to my question.
I'm trying to investigate the effect of covariate C in a longitudinal dataset. I have the two linear mixed-effects models given below:
A.lme <- lme( A ~ B + C, data = data1, random = ~ 1 | id)
B.lme <- lme( A ~ B*C, data = data1, random = ~ 1 | id)
I just want to be sure that I'm interpreting these two the right way. I believe that, in order to investigate covariate C, I should be analysing B.lme.
B represents time, A represents immune cells in the body, and C represents viral infection status.
The summary and anova for B.lme suggest that C has no significant effect on either the intercept or the slope, as given below:
>summary(B.lme)
Linear mixed-effects model fit by REML
Data: data1
AIC BIC logLik
4238.806 4270.106 -2113.403
Random effects:
Formula: ~1 | id
        (Intercept)  Residual
StdDev:   0.9242001 0.9692625
Fixed effects: A ~ C + B + C:B
                 Value Std.Error   DF   t-value p-value
(Intercept) -3.0675750 0.6212136 1118 -4.938036  0.0000
C            0.7364624 0.6264595  244  1.175595  0.2409
B            0.2200117 0.1988966 1118  1.106161  0.2689
C:B          0.0131436 0.2000672 1118  0.065696  0.9476
Correlation:
    (Intr) C      B
C   -0.992
B   -0.849  0.842
C:B  0.844 -0.844 -0.994
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-8.51192452 -0.38169972 0.05365992 0.47695927 7.43457534
Number of Observations: 1366
Number of Groups: 246
>anova(B.lme)
            numDF denDF  F-value p-value
(Intercept)     1  1118 811.5700  <.0001
C               1   244   3.7171  0.0550
B               1  1118 117.6260  <.0001
B:C             1  1118   0.0043  0.9476
When I had a closer look at A.lme, the summary/anova suggests that variable C is significant.
>summary(A.lme)
Linear mixed-effects model fit by REML
Data: data1
AIC BIC logLik
4235.429 4261.517 -2112.715
Random effects:
Formula: ~1 | id
        (Intercept)  Residual
StdDev:   0.9228998 0.9690801
Fixed effects: A ~ B + C
                 Value Std.Error   DF   t-value p-value
(Intercept) -3.1021904 0.3332309 1119 -9.309431  0.0000
B            0.2330059 0.0214786 1119 10.848303  0.0000
C            0.7713974 0.3352298  244  2.301100  0.0222
Correlation:
  (Intr) B
B -0.171
C -0.971  0.034
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-8.51328019 -0.38179254 0.05385169 0.47724088 7.43658227
Number of Observations: 1366
Number of Groups: 246
>anova(A.lme)
            numDF denDF  F-value p-value
(Intercept)     1  1119 813.4873  <.0001
B               1  1119 116.1162  <.0001
C               1   244   5.2951  0.0222
My first question is: which of the two models is more suitable for investigating C as a covariate? My second question is: how important is the significant p-value of C in A.lme? It seems to suggest to me that C has a significant impact on the slope and intercept, but not when combined with B (C:B). Can I safely conclude that C is not significant in B.lme? I'm using the nlme package in R.
Any help would be highly appreciated.
Best Answer
Your model B.lme shows that $C$ is not a significant predictor of the slope; this is what the interaction $B \cdot C$ tells you. In other words, there is no evidence that the effect of $B$ on the dependent variable differs across the values of $C$, or conversely, that the effect of $C$ on the dependent variable differs across the values of $B$. Because $B$ and $C$ are involved in an interaction in B.lme, the "main effects" are actually not main effects, but merely the effect of each variable when the other variable is 0. For example, $0.736$ in B.lme is the effect of $C$ on the dependent variable when $B=0$.
For this reason, having determined that $C$ is not a significant predictor of the slope, it's a good idea to remove the non-significant interaction so that the effects are truly interpretable as main effects. The model A.lme shows that $C$ is a significant predictor of the dependent variable controlling for (holding constant) $B$, and similarly, that $B$ is a significant predictor of the DV controlling for $C$.
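To make the "effect when the other variable is 0" point concrete, the two models can be written out as below. This is only a sketch: the symbols $\beta_0$, $\beta_B$, $\beta_C$, $\beta_{BC}$, $u_i$ (random intercept for subject $i$) and $\varepsilon_{ij}$ (residual for observation $j$ of subject $i$) are notation introduced here for illustration, and the numbers are the rounded estimates from the two summaries in the question.

$$
\text{B.lme:}\quad A_{ij} = \beta_0 + \beta_B B_{ij} + \beta_C C_{ij} + \beta_{BC}\, B_{ij} C_{ij} + u_i + \varepsilon_{ij}
$$

Collecting terms shows why the coefficient of $C$ in this model is a conditional effect rather than a main effect:

$$
\text{effect of } C \text{ at } B = b:\quad \beta_C + \beta_{BC}\, b \approx 0.736 + 0.013\, b
$$

$$
\text{slope of } B \text{ at } C = c:\quad \beta_B + \beta_{BC}\, c \approx 0.220 + 0.013\, c
$$

Dropping the interaction (setting $\beta_{BC} = 0$) gives A.lme, where neither effect depends on the other variable:

$$
\text{A.lme:}\quad A_{ij} = \beta_0 + \beta_B B_{ij} + \beta_C C_{ij} + u_i + \varepsilon_{ij}, \qquad \hat\beta_B \approx 0.233,\quad \hat\beta_C \approx 0.771
$$

So the $p = 0.2409$ for $C$ in B.lme tests the effect of $C$ only at $B = 0$, whereas the $p = 0.0222$ in A.lme tests the effect of $C$ at any fixed value of $B$.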