Solved – Interpreting effects of categorical and continuous predictors in multiple linear mixed models

categorical data, continuous data, mixed model, multiple regression, regression coefficients

I'm building an LMM with a continuous DV (signal amplitude) and two IVs: one continuous (questionnaire score) and one categorical with 3 levels (condition). The conditions are S, M and SM (the last being a combination of the first two). They are dummy coded, with SM set as the reference level. I use one random effect – a random intercept for subjects. Each subject encountered all the conditions.
This is the model in R's lme4:

m = lmer(DV ~ Q.score * cond + (1|subj), data = df, REML = FALSE)

(I read that with balanced designs and only one random effect I can use ML instead of REML.)
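
For completeness, this is roughly how the packages and factor coding are set up before fitting the model above – a minimal sketch, assuming cond is stored as a factor (relevel() is just one way to pick the reference level):

library(lme4)
library(lmerTest)                                # adds Satterthwaite df to summary() and anova()
df$cond = relevel(factor(df$cond), ref = "SM")   # dummy coding with SM as the baseline level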

summary() (with lmerTest loaded) gives these estimates of the fixed effects.
(According to the documentation, the t- and p-values are calculated with the Satterthwaite approximation for degrees of freedom.)

    Fixed effects:
                    Estimate Std. Error         df t value    Pr(>|t|)    
(Intercept)        -0.941813   0.179603 103.900000  -5.244 0.000000832 ***
Q.score            -0.043102   0.024048 103.900000  -1.792       0.076 .  
condSMvsM           0.752574   0.179470 104.000000   4.193 0.000057923 ***
condSMvsS           0.385522   0.179470 104.000000   2.148       0.034 *  
Q.score:condSMvsM   0.005157   0.024030 104.000000   0.215       0.830    
Q.score:condSMvsS   0.024290   0.024030 104.000000   1.011       0.314 

Since cond is a multilevel categorical predictor, it's easier to interpret the results by looking at the overall effect of the predictor. I know there are several ways of doing that, so I tried a few, but I'm surprised by the differences between them, especially in the p-values for the interaction. (Short comment: I know p-values aren't the miraculous answer to what is or isn't true, but I simply have to report them because my boss requires it, so please take a deep breath with me and be so kind as to help anyway…)

1) Analysis-of-variance table of Type II (or whatever SS type you specify) with the Satterthwaite approximation for degrees of freedom – lmerTest::anova

              Sum Sq Mean Sq NumDF DenDF F.value    Pr(>F)    
Q.score       2.4049  2.4049     1    52  2.8717 0.0961287 .  
cond         14.7285  7.3643     2   104  8.7937 0.0002961 ***
Q.score:cond  0.9501  0.4750     2   104  0.5672 0.5688346 
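
(A call along these lines produces a table of this form – a sketch, assuming the model m above with lmerTest attached:)

anova(m, type = "II", ddf = "Satterthwaite")   # Type II SS, Satterthwaite df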

2) Analysis of Deviance Table (Type II Wald chisquare tests) – car::Anova

               Chisq Df Pr(>Chisq)    
Q.score       2.8717  1  0.0901471 .  
cond         17.5875  2  0.0001517 ***
Q.score:cond  1.1345  2  0.5670901 
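
(And this one comes from something like the following – again a sketch; for merMod objects car::Anova() defaults to Type II Wald chi-square tests:)

car::Anova(m, type = "II", test.statistic = "Chisq")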

3) The likelihood ratio test – comparing lmer models

> m0 = lmer(DV ~ 1 + (1|subj), data=df, REML = FALSE)
> m1 = lmer(DV ~ Q.score + (1|subj), data=df, REML = FALSE)
> m2 = lmer(DV ~ cond + (1|subj), data=df, REML = FALSE)
> m3 = lmer(DV ~ Q.score * cond + (1|subj), data=df, REML = FALSE)
> anova(m0,m1,m2,m3)
Data: df
Models:
object: DV ~ 1 + (1 | subj)
..1: DV ~ Q.score + (1 | subj)
..2: DV ~ cond + (1 | subj)
..3: DV ~ Q.score * cond + (1 | subj)
       Df    AIC    BIC  logLik deviance   Chisq Chi Df Pr(>Chisq)    
object  3 513.25 522.40 -253.62   507.25                              
..1     4 512.45 524.65 -252.23   504.45  2.7953      1  0.0945439 .  
..2     5 501.16 516.41 -245.58   491.16 13.2916      1  0.0002666 ***
..3     8 503.24 527.64 -243.62   487.24  3.9236      3  0.2698349    
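
As an aside, the interaction term on its own can also be tested with a likelihood ratio test between two nested models – a sketch reusing the objects above (m_add is just a placeholder name):

m_add = lmer(DV ~ Q.score + cond + (1|subj), data = df, REML = FALSE)   # main effects only
anova(m_add, m3)                                                        # 2-df LRT for Q.score:cond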

And here is my first question:
(1) Which test do I use to report the effect of a categorical predictor? Should it be the F value from the analysis of variance? Or a chi-square statistic (and if so, from the analysis of deviance or from the likelihood ratio test)?

The effects don't differ very much in my example, but I am still facing another problem:

(2) How to describe the direction of an effect of a multilevel categorical predictor?

I have seen publications in which authors report coefficients, t-values and SDs of fixed effects. If I understand correctly, I could do the same from the summary() output (see the first piece of code). However, since that output only gives pairwise comparisons against the reference level, I would have to change the contrasts to address the pair of levels that does not involve the reference level of my first model – here: S vs. M. (I don't have an obvious way of choosing contrasts, such as a clinical group vs. a control group; I just have three arbitrary conditions.)
For example, this is the summary of the same model with condition M as the reference level (compared with SM above):

Fixed effects:
                    Estimate Std. Error         df t value  Pr(>|t|)    
(Intercept)        -0.189239   0.179603 103.900000  -1.054    0.2945    
Q.score            -0.037945   0.024048 103.900000  -1.578    0.1176    
condMvsS           -0.367052   0.179470 104.000000  -2.045    0.0434 *  
condMvsSM          -0.752574   0.179470 104.000000  -4.193 0.0000579 ***
Q.score:condMvsS    0.019133   0.024030 104.000000   0.796    0.4277    
Q.score:condMvsSM  -0.005157   0.024030 104.000000  -0.215    0.8305 
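
(For reference, one way to obtain this second summary is to refit after changing the reference level – a sketch; m.M is just a placeholder name, and depending on how the dummy contrasts are labelled the signs of the pairwise terms may flip:)

df$cond = relevel(df$cond, ref = "M")   # M becomes the dummy-coding reference
m.M = lmer(DV ~ Q.score * cond + (1|subj), data = df, REML = FALSE)
summary(m.M)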

And here I have two problems:

(3) Why do the continuous predictor's coefficient and t value change with different contrasts of the categorical predictor?
The Q.score t and p values changed. Of course the overall effect of this predictor stays the same (as found by one of the above methods). Does it mean that the DV decreases more steeply with Q.score when "going" from the M condition than from the SM condition?
Should I then only report the general effect of this predictor (as in the above methods)?

(4) Shouldn't I apply some kind of correction for multiple comparisons if I change the contrasts to explore all pairwise comparisons? Can I do it at all?…

And one more general question in the end:

(5) How do I obtain a general F statistic for the whole model? Or are R^2 and adjusted R^2 enough?

I know it's a lot for one post, but I really read everything I could find and I'm still stuck. I would really, really appreciate your help with any of these questions!

Best Answer

Since @mhoven asked, here are the answers to my own old questions – this is what I ended up doing (numbered as the questions in the post):

  1. Now I'm using anova() from lmerTest to describe the main effect of a multilevel categorical predictor;
  2. To describe a direction, I change the contrasts and then apply a correction for multiple comparisons (see the sketch after this list);
  3. The estimate of the continuous predictor changes with different contrasts of the categorical predictor because, with the interaction in the model, it is the simple slope of Q.score within the reference condition. Change the reference level (e.g. from SM to M) and the reported slope refers to a different condition; the interaction terms then give the differences in slopes between conditions.
  4. Yes, as in 2. I normally just use Bonferroni.
  5. I now report the marginal and conditional R^2 for the model. Hope that helps!
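
A minimal sketch of how points 2, 4 and 5 can be done, assuming the fitted model m from the question (the emmeans and performance packages are not mentioned above, so take these calls as one possible route rather than the only one):

library(emmeans)
library(performance)

# Pairwise comparisons between the three conditions, Bonferroni-adjusted
emmeans(m, pairwise ~ cond, adjust = "bonferroni")

# Per-condition slopes of Q.score – one way to see that summary() only reports
# the Q.score slope for the current reference condition
emtrends(m, ~ cond, var = "Q.score")

# Marginal and conditional R^2 for the mixed model
r2(m)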