Multiple Regression – Post-hoc Test for Differences in Slopes with Interaction

Tags: interaction, multiple regression, post-hoc

I am looking for a robust way to test for differences in slopes in a range of data. Here, I am showing R code to be clear about what I am attempting, but I believe that I am looking for a general answer (which I can then implement in R, if it is not available already).

As an example, I am using Edgar Anderson's "iris" data (built into R, and also available as a CSV) to build a prediction of Petal.Length by Sepal.Width when I can visually see that the Species have different relationships.

So, I build a linear model with the interaction between Species and Sepal.Width, and can see that there is a significant interaction. In R, the model is built as:

irisLM <-
  lm(Petal.Length ~ Sepal.Width*Species
     , data = iris)

and anova(irisLM) gives the following table:

Response: Petal.Length
                     Df Sum Sq Mean Sq   F value    Pr(>F)    
Sepal.Width           1  85.23  85.232  574.1677 < 2.2e-16 ***
Species               2 355.76 177.879 1198.2894 < 2.2e-16 ***
Sepal.Width:Species   2   1.96   0.979    6.5973  0.001814 ** 
Residuals           144  21.38   0.148

I can run summary(irisLM) to get the full linear model as well:

Call:
lm(formula = Petal.Length ~ Sepal.Width * Species, data = iris)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.03337 -0.22012 -0.03026  0.17149  1.60468 

Coefficients:
                              Estimate Std. Error t value Pr(>|t|)   
(Intercept)                    1.18292    0.50072   2.362  0.01949 * 
Sepal.Width                    0.08141    0.14520   0.561  0.57589   
Speciesversicolor              0.75200    0.69983   1.075  0.28437   
Speciesvirginica               2.32798    0.71507   3.256  0.00141 **
Sepal.Width:Speciesversicolor  0.75797    0.22770   3.329  0.00111 **
Sepal.Width:Speciesvirginica   0.60490    0.22408   2.699  0.00778 **

Here, the coefficient Sepal.Width:Speciesversicolor is (effectively) testing the null-hypothesis of no difference in the slope for Iris versicolor and Iris setosa. When there are only two species, that is sufficient, but what about in cases like this where I also want to compare versicolor and virginica?

I know that I can change the order of the factor to treat each level as the baseline, but that seems incomplete and inelegant, and it does not correct for multiple testing. (Not to mention that it is cumbersome for data with many levels and/or where levels may change from one analysis to the next.)

I have been searching extensively, and I have not found concrete answers. A few places have suggested using the estimates and standard errors to perform t-tests, but I am not sure that is a valid approach. (Further, it is unclear what degrees of freedom should be used for such tests.) I would hope that there would be a formal post-hoc test like Tukey's HSD for this (my understanding is that Tukey does not apply here because it is not a test of means; in any case, R throws an error if I try it with TukeyHSD(aov(irisLM), "Sepal.Width:Species")).

There are two questions that seem near duplicates, but they don't quite address what I want. One asks something very similar to what I want, but the answers either suggest changing which level is the baseline or show how to work around the interaction term (but not test it directly). The other has very good answers, but they appear to apply only when both predictors are factors; they do not work for me with a continuous variable.

If it helps, here is a version of the coefficient table that estimates the value of each slope/intercept, instead of showing the difference from the baseline:

without_intercept <-
  lm(Petal.Length ~ Species/Sepal.Width - 1
     , data = iris)

summary(without_intercept)

which gives:

Coefficients:
                              Estimate Std. Error t value Pr(>|t|)    
Speciessetosa                  1.18292    0.50072   2.362 0.019494 *  
Speciesversicolor              1.93492    0.48891   3.958 0.000118 ***
Speciesvirginica               3.51090    0.51049   6.877 1.72e-10 ***
Speciessetosa:Sepal.Width      0.08141    0.14520   0.561 0.575889    
Speciesversicolor:Sepal.Width  0.83938    0.17540   4.785 4.18e-06 ***
Speciesvirginica:Sepal.Width   0.68632    0.17067   4.021 9.30e-05 ***

Best Answer

The lsmeans package does provide for this type of comparison (its successor, emmeans, offers the same functionality, with lstrends renamed to emtrends). For the example at hand,

library("lsmeans")
iris.lst <- lstrends(irisLM, ~ Species, var = "Sepal.Width")
iris.lst          # slope estimates and CIs
pairs(iris.lst)   # comparisons
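On the question of whether t-tests built from the estimates and standard errors are valid, and which degrees of freedom to use: the same pairwise slope differences can be computed by hand in base R. The following is only a minimal sketch, not the lsmeans implementation. It assumes the nested parameterization shown in the question, uses the model's residual degrees of freedom (144 here), and swaps in a Holm adjustment via p.adjust() instead of the Tukey adjustment that pairs() applies by default. The names fit_nested and slope_tests are illustrative.

```r
# Nested parameterization: one intercept and one slope per species.
fit_nested <- lm(Petal.Length ~ Species / Sepal.Width - 1, data = iris)

b <- coef(fit_nested)   # coefficient estimates
V <- vcov(fit_nested)   # their covariance matrix

# Names of the three per-species slope coefficients.
slopes <- grep(":Sepal.Width$", names(b), value = TRUE)

# Every pair of slopes, one pair per column.
slope_pairs <- combn(slopes, 2)

slope_tests <- do.call(rbind, lapply(seq_len(ncol(slope_pairs)), function(j) {
  first  <- slope_pairs[1, j]
  second <- slope_pairs[2, j]
  est <- b[[first]] - b[[second]]
  # Var(a - b) = Var(a) + Var(b) - 2 * Cov(a, b)
  se <- sqrt(V[first, first] + V[second, second] - 2 * V[first, second])
  data.frame(contrast = paste(first, "-", second),
             estimate = est,
             SE       = se,
             df       = df.residual(fit_nested),
             t.ratio  = est / se)
}))

# Two-sided p-values on the residual df, then a Holm correction
# (an assumption here; pairs() in lsmeans defaults to Tukey).
slope_tests$p.value <- 2 * pt(abs(slope_tests$t.ratio), slope_tests$df,
                              lower.tail = FALSE)
slope_tests$p.holm  <- p.adjust(slope_tests$p.value, method = "holm")

slope_tests
```

The setosa versus versicolor row reproduces, up to sign, the Sepal.Width:Speciesversicolor line of summary(irisLM), which is a quick way to confirm that the contrast arithmetic matches the model's own test.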

Related Solutions

Solved – Interpreting a significant 2-way interaction with post-hoc tests

Pooling across the remaining group was correct. You don't need any more tests. Take choice 1. Your interaction means the effect of difficulty (hard - easy) in controls is different from the effect of difficulty in clinical. Since that is exactly what you want to know, a post hoc test is completely unnecessary.

You might want to see Gelman and Stern (2006) for a lesson in why the path you were thinking of taking with post hocs was faulty logic.

Gelman, A., & Stern, H. (2006). The difference between “significant” and “not significant” is not itself statistically significant. The American Statistician, 60(4), 328-331. doi: 10.1198/000313006X152649

Some general information might help others in the same situation. There seems to be a broadly accepted but mistaken belief that there must be a significant simple effect when you have a significant interaction. Not only is this false, but there is no guarantee of any significant simple effect in an ANOVA with more than 2 levels (with exactly 2 levels, the ANOVA is just an alternative to a t-test). Interactions are about differences among differences. None of the individual differences needs to be significant, or even close to significant, for the interaction to be significant. Consider a situation where one effect is +5 and another is -5. The difference between those effects is 10, which is much larger than either simple effect. It is therefore quite possible for a simple effect of 5 to be non-significant while the interaction of 10 is significant.
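The +5 / -5 arithmetic can be made concrete with a few lines of R. This is a sketch with made-up numbers: it assumes each simple effect has a standard error of 3, that the two estimates are independent, and that 60 error degrees of freedom are available. None of these values come from a real data set.

```r
# Hypothetical simple effects and their (assumed) common standard error.
effect_a  <- 5
effect_b  <- -5
se_simple <- 3
df_error  <- 60

# Test of one simple effect against zero.
t_simple <- effect_a / se_simple
p_simple <- 2 * pt(abs(t_simple), df_error, lower.tail = FALSE)

# Test of the difference between the two effects (the interaction).
# For independent estimates the standard error grows only by sqrt(2),
# while the estimate itself doubles.
se_diff       <- sqrt(2) * se_simple
t_interaction <- (effect_a - effect_b) / se_diff
p_interaction <- 2 * pt(abs(t_interaction), df_error, lower.tail = FALSE)

c(p_simple = p_simple, p_interaction = p_interaction)
```

Under these assumptions the simple effect misses the 0.05 threshold while the interaction clears it, because the difference is twice as large as either effect but its standard error is only about 1.4 times as large.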

With something as simple as a 2x2 design one should never perform a post hoc on the interaction because the interaction told you all of the necessary information. Even with slightly more complex designs post hoc tests are usually unnecessary, especially when one variable only has 2 levels. People generally do far too many post hoc tests following ANOVAs. In school they tell you to do an ANOVA to avoid multiple comparison problems. Of what benefit is that when, after you perform the ANOVA, you perform a bunch of comparisons? Interpret the ANOVA first.

Solved – Post hoc test other than Tukey for factorial ANOVA data in R

You might want to look at the lsmeans package or the multcomp package. Here is some demonstration R code to show you some of the options you have.

Load both packages. Also, the car library provides some functionality for carrying out analysis of variance that is sometimes helpful.

# Load handy packages.
library(car)
library(lsmeans)
library(multcomp)

Then read the data and check what shape it arrived in.

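# Read the data; stringsAsFactors = TRUE keeps Treatment and Stage as factors
# (since R 4.0, read.csv no longer converts strings to factors by default).
Example <- read.csv("Example.csv", stringsAsFactors = TRUE)
# Inspect the structure of the data.
str(Example)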

Looking at the structure of the data, the "Control" value of Treatment is the first level of the factor. That is handy because that is usually the default reference value.

'data.frame':   27 obs. of  3 variables:
 $ Treatment  : Factor w/ 3 levels "Control","Nitrogen",..: 3 3 3 3 3 3 3 3 3 2 ...
 $ Stage      : Factor w/ 3 levels "Green","Pink",..: 1 1 1 2 2 2 3 3 3 1 ...
 $ Chlorophyll: num  0.2 0.3 0.4 0.5 0.3 0.2 0.5 0.6 0.7 0.4 ...

Perform the analysis and a little bit of model-checking.

# Fit the two-way factorial model.
fit <- lm(Chlorophyll ~ Treatment + Stage + Treatment:Stage, Example)
# Look at the model goodness of fit.
plot(fit)
shapiro.test(residuals(fit))

The model fits pretty well. Good job on the fake data set! Now look at the results.

# Perform an analysis of variance.
Anova(fit)

No matter how you look at it, there is a two-way interaction effect in these data.

Anova Table (Type II tests)

Response: Chlorophyll
                 Sum Sq Df F value   Pr(>F)   
Treatment       0.00519  2  0.1273 0.881279   
Stage           0.12741  2  3.1273 0.068283 . 
Treatment:Stage 0.53259  4  6.5364 0.001972 **
Residuals       0.36667 18                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can use the lsmip() function to plot the interaction effects.

# Look at interaction effects via least squares means.
lsmip(fit, Stage ~ Treatment)

If we ignore the interaction with Stage, we can look at all pairwise comparisons of Treatment using the Tukey adjustment for multiple testing. The [[2]] just picks off the second part of the list of results.

# Look at Treatment averaged over levels of Stage.
lsmeans(fit, pairwise ~ Treatment)[[2]]

However, for these data, it would be best to take into account the interaction. We can "slice" the interaction by Stage and compare all levels of Treatment to each other for each Stage.

# Compare sliced least squares means using the Tukey method.
fit.tukey <- lsmeans(fit, pairwise ~ Treatment | Stage)[[2]]
fit.tukey
cld(fit.tukey)

The cld() function provides the usual letter codes. We can see the interaction effect in the differential pattern of significant differences among the groups.

Stage = Green:
 contrast                estimate        SE df t.ratio p.value .group
 Control - Nitrogen -1.188445e-17 0.1165343 18   0.000  1.0000  1    
 Control - Salt      3.333333e-01 0.1165343 18   2.860  0.0268   2   
 Nitrogen - Salt     3.333333e-01 0.1165343 18   2.860  0.0268  12   

Stage = Pink:
 contrast                estimate        SE df t.ratio p.value .group
 Nitrogen - Salt     3.962759e-17 0.1165343 18   0.000  1.0000  1    
 Control - Nitrogen  1.666667e-01 0.1165343 18   1.430  0.3470  1    
 Control - Salt      1.666667e-01 0.1165343 18   1.430  0.3470  1    

Stage = Red:
 contrast                estimate        SE df t.ratio p.value .group
 Control - Salt     -4.000000e-01 0.1165343 18  -3.432  0.0079  1    
 Nitrogen - Salt    -3.000000e-01 0.1165343 18  -2.574  0.0478  12   
 Control - Nitrogen -1.000000e-01 0.1165343 18  -0.858  0.6727   2   

P value adjustment: tukey method for a family of 3 means 
significance level used: alpha = 0.05

Using Dunnett's test might be better if you don't care about all pairwise comparisons. This code compares against the reference level mentioned above, but you have much more flexibility if needed.

# Compare sliced least squares means via Dunnett's method.
fit.lsm <- lsmeans(fit, "Treatment", by=c("Stage"))
contrast(fit.lsm, "trt.vs.ctrl")

Results are pretty similar to those obtained using the Tukey method in this case.
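For readers without lsmeans installed, the comparison-against-control estimates within each slice can also be read directly from a nested model in base R. The sketch below is an illustration under stated assumptions: Example.csv is not available here, so it uses the built-in warpbreaks data, with wool playing the role of the slicing factor and tension level "L" standing in for the control. It also swaps in a Bonferroni correction within each slice as a conservative stand-in, since a proper Dunnett adjustment is not part of base R. The names fit_nested and vs_ctrl are illustrative.

```r
# Nested model: tension effects estimated separately within each wool type.
# With treatment contrasts, each wool:tension coefficient is the difference
# from the reference tension level ("L") inside that wool.
fit_nested <- lm(breaks ~ wool / tension, data = warpbreaks)

coefs <- summary(fit_nested)$coefficients

# Keep only the within-slice comparisons against the reference level.
vs_ctrl <- as.data.frame(coefs[grepl(":tension", rownames(coefs)), , drop = FALSE])

# Label each row with its slice (the wool type before the colon).
vs_ctrl$slice <- sub(":.*$", "", rownames(vs_ctrl))

# Bonferroni correction applied separately within each slice
# (an assumption here; this is not Dunnett's adjustment).
vs_ctrl$p.bonf <- ave(vs_ctrl[["Pr(>|t|)"]], vs_ctrl$slice,
                      FUN = function(p) p.adjust(p, method = "bonferroni"))

vs_ctrl
```

Because the model is saturated, each estimate equals the difference between the two corresponding cell means, and the standard errors use the residual variance pooled over all cells, which is also what the sliced lsmeans contrasts rely on.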
