Solved – What test can I use to compare intercepts from two or more regression models when slopes might differ

Tags: intercept, r, regression

I wish to test whether intercepts in linear regression models differ between two or more groups, when group-specific slopes might themselves differ (i.e., an interaction term may be present). Specifically, I want to compare intercepts between all pairwise combinations of groups.

The question entitled "What test can I use to compare slopes from two or more regression models?" shows how to test whether the slopes differ between all pairwise combinations of groups. However, I have not been able to find an equivalent way to test whether the intercepts differ between pairwise combinations of groups.
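
For reference, one way to do that pairwise slope comparison (not necessarily the linked answer's exact code) is emtrends() from the emmeans package; a minimal sketch using the iris example discussed below:

library(emmeans)

# Model with species-specific slopes for Petal.Width
mod <- lm(Sepal.Length ~ Petal.Width * Species, data = iris)

# Pairwise comparisons of the Petal.Width slopes across species
emtrends(mod, pairwise ~ Species, var = "Petal.Width")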

The coefficient table (e.g., summary(lm(Sepal.Length ~ Petal.Width*Species, data = iris)))
will tell you whether the intercept of each group differs from that of the baseline group (i.e., the first level of the grouping factor). However, is there a convenient way to perform all pairwise comparisons between groups?
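
For concreteness, those baseline-relative intercept comparisons can be read from the coefficient table; a minimal sketch:

mod <- lm(Sepal.Length ~ Petal.Width * Species, data = iris)

# With treatment contrasts (R's default), the Speciesversicolor and
# Speciesvirginica rows are the intercept differences of those groups
# from the baseline (setosa), together with their t-tests.
summary(mod)$coefficients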

Best Answer

I will answer the technical question, then try to talk you out of doing this.

The intercept is the predicted value when the predictor on the abscissa (here, Petal.Width) equals zero. Hence, the intercepts in this example are obtained via:

> mod = lm(Sepal.Length ~ Petal.Width*Species, data = iris)

> library("emmeans")
> (emm = emmeans(mod, "Species", at = list(Petal.Width = 0)))
NOTE: Results may be misleading due to involvement in interactions
 Species    emmean    SE  df lower.CL upper.CL
 setosa       4.78 0.173 144     4.43     5.12
 versicolor   4.04 0.464 144     3.13     4.96
 virginica    5.27 0.509 144     4.26     6.28

Confidence level used: 0.95 

... and the comparisons thereof can be tested this way:

> pairs(emm)
 contrast               estimate    SE  df t.ratio p.value
 setosa - versicolor       0.733 0.495 144  1.480  0.3037 
 setosa - virginica       -0.492 0.538 144 -0.915  0.6316 
 versicolor - virginica   -1.225 0.689 144 -1.779  0.1804 

P value adjustment: tukey method for comparing a family of 3 estimates 
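
An alternative cross-check in base R is to refit the model with a different reference level and read the intercept differences off the coefficient table; a sketch, noting that this gives unadjusted p-values rather than the Tukey-adjusted ones above:

# Make versicolor the reference level; the Species coefficients are then
# the other species' intercept differences from versicolor (unadjusted t-tests)
iris2 <- transform(iris, Species = relevel(Species, ref = "versicolor"))
mod2 <- lm(Sepal.Length ~ Petal.Width * Species, data = iris2)
summary(mod2)$coefficients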

That said, it is unusual for the intercept to be an interesting or meaningful quantity to make inferences about. In many datasets, the intercept is a severe extrapolation, because a predictor value of zero lies far below the observed values (note the large standard errors for versicolor and virginica above, whose observed Petal.Width values are nowhere near zero). Models are only approximations to the truth, and it is highly questionable that the straight line you have fitted actually represents the trend at such a distant point.
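
If you do want to compare the groups' fitted lines, it is usually more defensible to compare predictions at a Petal.Width that actually occurs in the data, for example its observed mean, which is what emmeans uses for covariates when at is not specified. A sketch:

# Compare species at the mean observed Petal.Width instead of at zero
emm_mean <- emmeans(mod, "Species")   # covariate set to its mean by default
pairs(emm_mean)

Because the model contains an interaction, these group differences change with Petal.Width, so the value at which you compare must itself be justified.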

Thus, I urge you to re-think what you are trying to do and decide on what meaningful question you are really trying to answer.