Solved – Intercepts (reference) in linear mixed effect model, what to choose

Tags: intercept, interpretation, lme4-nlme

I am working on a study that examines word processing by native and non-native speakers. We have three independent variables: Group (NSs vs. NNSs), Word Type (five conditions), and Word Relatedness (related vs. unrelated words), and one dependent variable: Reaction Time (RT).
I have successfully run a linear mixed effects model in R using the lme4 package, and I was able to understand the output. However, there is something that I do not understand: the intercept (reference level). Could someone explain what the intercept or reference level means? Does it mean that all the other levels of the variables are compared to it? If so, should I adjust the intercept level or let lme4 use the default?
I am struggling to make this decision and I need reasons to back up any choice I make.

Note that the Word Types are not related to each other, but they are compared with regard to Word Relatedness. Interestingly, when I change the intercept level the whole output changes. As you can see below, the Related:WordType interaction terms also change when the intercept is changed; shouldn't these be the same in the two outputs?

> contrasts(NS$WordType)     # default intercept by R
  2 3 4 5
1 0 0 0 0
2 1 0 0 0
3 0 1 0 0
4 0 0 1 0
5 0 0 0 1

Linear mixed model fit by REML ['lmerMod']
Formula: RT ~ Related * WordType + (1 | Item) + (1 | Subject) 
   Data: NS 

REML criterion at convergence: 59570.13 

Random effects:
 Groups   Name        Variance Std.Dev.
 Item     (Intercept)  1099     33.15  
 Subject  (Intercept)  4002     63.26  
 Residual             16153    127.10  
Number of obs: 4737, groups: Item, 120; Subject, 43

Fixed effects:
                    Estimate Std. Error t value
(Intercept)          681.373     13.179   51.70
Related2              13.655      8.379    1.63
WordType2             1.220     12.729    0.10
WordType3             5.700     12.709    0.45
WordType4             5.199     12.696    0.41
WordType5           -48.050     12.690   -3.79
Related2:WordType2  -20.925     11.862   -1.76
Related2:WordType3   -6.780     11.869   -0.57
Related2:WordType4  -19.746     11.854   -1.67
Related2:WordType5   28.870     11.859    2.43

NS$WordType <- relevel(NS$WordType,"4")     # Changed the intercept to word type 4
contrasts(NS$PrimeType)
  1 2 3 5
4 0 0 0 0
1 1 0 0 0
2 0 1 0 0
3 0 0 1 0
5 0 0 0 1

Linear mixed model fit by REML ['lmerMod']
Formula: RT ~ Related * WordType + (1 | Item) + (1 | Subject) 
   Data: NS

REML criterion at convergence: 59570.13 

Random effects:
 Groups   Name        Variance Std.Dev.
 Item     (Intercept)  1099     33.15  
 Subject  (Intercept)  4002     63.26  
 Residual             16153    127.10  
Number of obs: 4737, groups: Item, 120; Subject, 43

Fixed effects:
                    Estimate Std. Error t value
(Intercept)         686.5723    13.1779   52.10
Related2             -6.0911     8.3859   -0.73
WordType1           -5.1994    12.6957   -0.41
WordType2           -3.9794    12.7283   -0.31
WordType3            0.5006    12.7083    0.04
WordType5          -53.2497    12.6895   -4.20
Related2:WordType1  19.7457    11.8541    1.67
Related2:WordType2  -1.1791    11.8656   -0.10
Related2:WordType3  12.9659    11.8734    1.09
Related2:WordType5  48.6158    11.8662    4.10

I have also come across the afex package, which I originally intended to use for calculating p-values. I noticed that it chooses a different default contrast coding when running the model.

> contrasts(NS$WordType)     # default intercept by afex
  [,1] [,2] [,3] [,4]
1    1    0    0    0
2    0    1    0    0
3    0    0    1    0
4    0    0    0    1
5   -1   -1   -1   -1


Linear mixed model fit by REML ['lmerMod']
Formula: RT ~ Related * WordType + (1 | Item) + (1 | Subject) 
   Data: NS

REML criterion at convergence: 59583.5 

Random effects:
 Groups   Name        Variance Std.Dev.
 Item     (Intercept)  1099     33.15  
 Subject  (Intercept)  4002     63.26  
 Residual             16153    127.10  
Number of obs: 4737, groups: Item, 120; Subject, 43

Fixed effects:
                    Estimate Std. Error t value
(Intercept)         679.1560    10.2793   66.07
Related1             -4.9693     1.8766   -2.65
WordType1            9.0442     7.0943    1.27
WordType2           -0.1982     7.0930   -0.03
WordType3           11.3543     7.1003    1.60
WordType4            4.3708     7.0972    0.62
Related1:WordType1  -1.8580     3.7485   -0.50
Related1:WordType2   8.6044     3.7544    2.29
Related1:WordType3   1.5319     3.7579    0.41
Related1:WordType4   8.0148     3.7509    2.14

When I saw the negative numbers in the contrasts, I wondered whether I could reproduce the afex output by adjusting the reference levels.

> NS$WordType <- relevel(NS$WordType,"5")   # changed the intercept to word type 5
> NS$Related <- relevel(NS$Related, "2")          # changed word relatedness to 2 (unrelated)
> contrasts(NS$WordType)                               # both adjusted to mimic afex output
  1 2 3 4
5 0 0 0 0
1 1 0 0 0
2 0 1 0 0
3 0 0 1 0
4 0 0 0 1

Linear mixed model fit by REML ['lmerMod']
Formula: RT ~ Related * WordType + (1 | Item) + (1 | Subject) 
   Data: NS

REML criterion at convergence: 59570.13 

Random effects:
 Groups   Name        Variance Std.Dev.
 Item     (Intercept)  1099     33.15  
 Subject  (Intercept)  4002     63.26  
 Residual             16153    127.10  
Number of obs: 4737, groups: Item, 120; Subject, 43

Fixed effects:
                    Estimate Std. Error t value
(Intercept)         675.8473    13.1857   51.26
Related1            -42.5247     8.3944   -5.07
WordType1           19.1802    12.6920    1.51
WordType2           -0.5246    12.6622   -0.04
WordType3           18.1004    12.7000    1.43
WordType4            4.6339    12.7001    0.36
Related1:WordType1  28.8701    11.8595    2.43
Related1:WordType2  49.7949    11.8727    4.19
Related1:WordType3  35.6499    11.8805    3.00
Related1:WordType4  48.6158    11.8662    4.10

As you can see, the output is still different from the afex output.

To summarize, my questions are as follows:
1. What is the intercept in linear mixed effects models?
2. What do I need to know in order to make a wise decision when adjusting the intercept?
3. Why does the afex package choose a different default intercept? Would that be the "best" approach to choosing the intercept for this type of design?

Best Answer

The intercept is the predicted value of the dependent variable when all the independent variables are 0. Since all your IVs are categorical, what it means for an IV to be 0 depends entirely on how the variable is coded, and the default coding is not necessarily the most useful one. You can find out what each package uses as its default from the documentation.
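To see this concretely, here is a minimal sketch with made-up data (not your NS data): under R's default treatment (dummy) coding, the intercept estimates the mean of the reference level, and every other coefficient is a difference from that reference.

# Made-up illustration, not the NS data: with treatment (dummy) coding,
# the reference level is the row of all zeros in the contrast matrix.
set.seed(1)
d <- data.frame(g = factor(rep(c("a", "b", "c"), each = 20)),
                y = rnorm(60, mean = rep(c(500, 520, 540), each = 20)))
contrasts(d$g)      # level "a" is the reference (its row is all zeros)
coef(lm(y ~ g, d))  # (Intercept) is roughly the mean of "a"; gb and gc are differences from "a"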

The reasons to select a particular reference level can be substantive (e.g. one group is naturally seen as the "default") or statistical (e.g. one reasonable choice is the level with the lowest, or highest, predicted value, so that all other levels are compared to that one).
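Whichever level you pick, releveling only re-parameterises the fixed effects; it does not change the fitted model. A quick check of this, assuming your NS data frame and the model from your question:

# Sketch assuming the NS data from the question is loaded.
library(lme4)
m1 <- lmer(RT ~ Related * WordType + (1 | Item) + (1 | Subject), data = NS)
NS$WordType <- relevel(NS$WordType, "4")
m2 <- lmer(RT ~ Related * WordType + (1 | Item) + (1 | Subject), data = NS)
all.equal(fitted(m1), fitted(m2), tolerance = 1e-6)  # TRUE: same predictions, up to optimizer noise
REMLcrit(m1); REMLcrit(m2)                           # same REML criterion (59570.13 in your output)

This is why the REML criterion, random-effect variances, and fitted values are identical across your first, second, and fourth outputs even though the coefficient table looks completely different.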

When a categorical variable has only two levels, changing the reference level simply flips the sign (e.g. if boys are 5 inches taller than girls, then girls are 5 inches shorter than boys), but when there are more levels the changes are not as intuitively obvious.
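As for afex: the contrast matrix it printed for WordType matches sum-to-zero (deviation) coding, i.e. contr.sum(5) in base R, which afex uses by default. Under that coding the intercept is the unweighted grand mean of the cells rather than any single reference cell, and each coefficient is a deviation from that grand mean; that is why no amount of relevel()-ing, which only permutes treatment coding, will reproduce the afex output. A sketch of how you could get the same parameterisation in plain lme4, again assuming your NS data frame:

# Sketch assuming the NS data from the question, with WordType levels
# still in their original order 1..5.
library(lme4)
contr.sum(5)                            # the same matrix afex showed for WordType
contrasts(NS$WordType) <- contr.sum(5)
contrasts(NS$Related)  <- contr.sum(2)
m_sum <- lmer(RT ~ Related * WordType + (1 | Item) + (1 | Subject), data = NS)
summary(m_sum)                          # fixed effects should match the afex output

For a factorial design like yours, sum coding has the appeal that the "main effect" coefficients are averaged over the levels of the other factor (ANOVA-style), whereas with treatment coding they are simple effects at the reference level of the other factor; that is the usual argument for afex's default, but neither coding is wrong, and the choice should follow the comparisons you actually want to report.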