Posthoc Analysis LMM – Interaction with a Factor Containing 3 Levels

interactionlme4-nlmelsmeansr

I am fitting a Linear Mixed Model using the lmerTest package. I have two fixed effects: group (categorical, 2 levels) and period (categorical, 3 levels). I have an interaction term between the two fixed effects period*group The response variable is continuous. I have one random effect (id). Please see the data below:

> moddf <- dput(mydata)
structure(list(id = c("id3", "id5", "id3", "id5", "id3", "id5", 
"id1", "id2", "id4", "id6", "id7", "id8", "new", "id1", "id2", 
"id4", "id6", "id7", "id8", "new", "id1", "id2", "id4", "id6", 
"id7", "id8", "new"), area = c(28.28, 11.04, 31, 10.28, 18.4, 
18.12, 17.56, 18.28, 13.04, 14.24, 21.44, 26.48, 22.56, 7.92, 
16.28, 8.52, 16.8, 7.36, 11.56, 17.64, 4.56, 5.04, 8.16, 6.2, 
3.24, 5.28, 6.6), period = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L), levels = c("night", "outside", "within"), class = "factor"), 
    group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L), levels = c("groupN", "groupP"), class = "factor")), row.names = c(NA, 
-27L), class = "data.frame")

I specified the following model:

lmR_g <- lmerTest::lmer(area ~ group + period + period*group + (1|id), data = moddf, REML = F)

The summary shows that the interaction is significant. And visual representation asks for a post hoc analysis as the effects of period clearly differs in the two groups (see plot). While there are different options described online, I am unsure which one is most appropriate. I have used lsmeans so far to test for each period at the levels of the group (see below). However, any advice regarding post hoc options for this situation as well as advice if the $lsmeans or $contrasts should be used if lsmeans() is applied would be highly appreciated.

lsmeans(lmR_g, pairwise ~ period | group)

$lsmeans
group = groupN:
 period  lsmean   SE   df lower.CL upper.CL
 night    19.66 3.89 29.9    11.71    27.61
 outside  20.64 3.89 29.9    12.69    28.59
 within   18.26 3.89 29.9    10.31    26.21

group = groupP:
 period  lsmean   SE   df lower.CL upper.CL
 night    19.09 2.08 29.9    14.83    23.34
 outside  12.30 2.08 29.9     8.05    16.55
 within    5.58 2.08 29.9     1.33     9.83

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 

$contrasts
group = groupN:
 contrast         estimate   SE   df t.ratio p.value
 night - outside     -0.98 4.65 23.1  -0.211  0.9759
 night - within       1.40 4.65 23.1   0.301  0.9515
 outside - within     2.38 4.65 23.1   0.511  0.8666

group = groupP:
 contrast         estimate   SE   df t.ratio p.value
 night - outside      6.79 2.49 23.1   2.729  0.0309
 night - within      13.50 2.49 23.1   5.427  <.0001
 outside - within     6.71 2.49 23.1   2.699  0.0329

Degrees-of-freedom method: kenward-roger 
P value adjustment: tukey method for comparing a family of 3 estimates 

Best Answer

First, it's worth considering whether your interaction truly is significant. Looking at the summary() of your model:

> summary(lmR_g)
Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's method [lmerModLmerTest]
Formula: area ~ group + period + period * group + (1 | id)
   Data: moddf

     AIC      BIC   logLik deviance df.resid 
   176.0    186.3    -80.0    160.0       19 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.6776 -0.6796 -0.1875  0.8329  1.6776 

Random effects:
 Groups   Name        Variance Std.Dev.
 id       (Intercept)  6.729   2.594   
 Residual             16.849   4.105   
Number of obs: 27, groups:  id, 9

Fixed effects:
                          Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)                19.6600     3.4335  23.2179   5.726 7.58e-06 ***
groupgroupP                -0.5743     3.8933  23.2179  -0.148   0.8840    
periodoutside               0.9800     4.1048  18.0000   0.239   0.8140    
periodwithin               -1.4000     4.1048  18.0000  -0.341   0.7370    
groupgroupP:periodoutside  -7.7686     4.6544  18.0000  -1.669   0.1124    
groupgroupP:periodwithin  -12.1029     4.6544  18.0000  -2.600   0.0181 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
             (Intr) grpgrP prdtsd prdwth grpgrpP:prdt
groupgroupP  -0.882                                  
periodoutsd  -0.598  0.527                           
periodwithn  -0.598  0.527  0.500                    
grpgrpP:prdt  0.527 -0.598 -0.882 -0.441             
grpgrpP:prdw  0.527 -0.598 -0.441 -0.882  0.500   

You can see that your model adds two interaction terms, one of which is p<0.05. To test whether the interaction as a whole is significant, you can consider adopting more of an ANOVA framework and testing whether the addition of the interaction term improves model fit:

> lmR_g_null = lmerTest::lmer(area ~ group + period  + (1|id), data = moddf, REML = F)
> anova(lmR_g_null,lmR_g)
Data: moddf
Models:
lmR_g_null: area ~ group + period + (1 | id)
lmR_g: area ~ group + period + period * group + (1 | id)
           npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)  
lmR_g_null    6 177.84 185.61 -82.920   165.84                       
lmR_g         8 175.97 186.33 -79.984   159.97 5.8721  2    0.05308 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The chisquared test of the log-likelihood ratio isn't quite significant, and while the AIC is smaller in the full model, the BIC is not. So you may not have an interaction here.

Even so, if you did have an interaction, the way your lsmeans() call is structured only indirectly tells you about it's source. In your example the results reflect the effect of effect of period within each group, instead you want the effect of group at each level of period:

> lsmeans(lmR_g, pairwise ~ group | period)
$lsmeans
period = night:
 group  lsmean   SE   df lower.CL upper.CL
 groupN  19.66 3.89 29.9    11.71    27.61
 groupP  19.09 2.08 29.9    14.83    23.34

period = outside:
 group  lsmean   SE   df lower.CL upper.CL
 groupN  20.64 3.89 29.9    12.69    28.59
 groupP  12.30 2.08 29.9     8.05    16.55

period = within:
 group  lsmean   SE   df lower.CL upper.CL
 groupN  18.26 3.89 29.9    10.31    26.21
 groupP   5.58 2.08 29.9     1.33     9.83

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 

$contrasts
period = night:
 contrast        estimate   SE   df t.ratio p.value
 groupN - groupP    0.574 4.41 29.9   0.130  0.8974

period = outside:
 contrast        estimate   SE   df t.ratio p.value
 groupN - groupP    8.343 4.41 29.9   1.890  0.0685

period = within:
 contrast        estimate   SE   df t.ratio p.value
 groupN - groupP   12.677 4.41 29.9   2.872  0.0074

Degrees-of-freedom method: kenward-roger 

If you did have an interaction, it would be due to an effect of group only when period = within.