I ask the question based on a current case, but I would really appreciate a general answer, because it has been bugging me for some time:
I'm running regressions with interaction effects.
How do I test if the interaction is significant?
Option A: I look at the interaction coefficients. If they are significant, the interaction is significant.
Option B: I run two regression models: one with only the main effects and one with the main effects plus the interaction terms. If the interaction model explains significantly more variance, I interpret the interaction.
(e.g., comparing the two models with the anova() function in R; running an F test)
Many of my colleagues choose option A, but I seem to recall that my statistics instructor insisted that option B is preferable.
This question has become pertinent, because I have some models where the interaction term is significant, but the explanatory power of the models with and without the interaction is not significantly different.
Best Answer
Option B.
Option A can be inconsistent, especially if there are categorical variables. Simply changing the reference group can drastically change the p-values of each dummy variable and of each dummy's interaction term in the regression output.
Option B provides an overall test, and its result is the same no matter which reference group is selected.
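To make the reference-group point concrete, here is a minimal sketch in Python with numpy (not R's anova()). It uses simulated data and computes the statistics by hand. The variable names, the three-level factor `g`, and the helper functions are all hypothetical choices for illustration. It fits the same interaction model with two different reference levels and compares the per-dummy t statistics with the overall F statistic.

```python
import numpy as np

def design(x, g, ref, interaction, levels=(0, 1, 2)):
    """Dummy-coded design matrix, using `ref` as the reference level."""
    dummies = [(g == lev).astype(float) for lev in levels if lev != ref]
    cols = [np.ones_like(x), x] + dummies
    if interaction:
        cols += [d * x for d in dummies]  # one interaction column per dummy
    return np.column_stack(cols)

def fit(X, y):
    """Least-squares fit; returns coefficients and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

def t_stats(X, y):
    """t statistic of every coefficient in the model."""
    beta, rss = fit(X, y)
    sigma2 = rss / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta / se

def f_stat(x, g, y, ref):
    """F statistic comparing main-effects-only vs. interaction model."""
    X0 = design(x, g, ref, interaction=False)
    X1 = design(x, g, ref, interaction=True)
    _, rss0 = fit(X0, y)
    _, rss1 = fit(X1, y)
    q = X1.shape[1] - X0.shape[1]   # number of added interaction terms
    df = len(y) - X1.shape[1]       # residual df of the full model
    return ((rss0 - rss1) / q) / (rss1 / df)

# Simulated data (assumption): the slope of x differs by group.
rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)
g = rng.integers(0, 3, size=n)
slopes = np.array([0.0, 0.4, 0.8])
y = 1.0 + slopes[g] * x + rng.normal(size=n)

# Last two coefficients are the interaction terms.
t_ref0 = t_stats(design(x, g, 0, True), y)[-2:]
t_ref1 = t_stats(design(x, g, 1, True), y)[-2:]
f_ref0 = f_stat(x, g, y, ref=0)
f_ref1 = f_stat(x, g, y, ref=1)

print("interaction t stats, reference 0:", t_ref0)
print("interaction t stats, reference 1:", t_ref1)
print("overall F, reference 0 vs 1:", f_ref0, f_ref1)
```

The individual interaction t statistics answer different questions under each coding (each one compares a group's slope with the reference group's slope), so they move when the reference changes. The two design matrices span the same column space, so the residual sums of squares, and therefore the F statistic, do not.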
For ordinary continuous predictors, the p-value of the interaction coefficient is the same as the F-test p-value. Assuming both $x_1$ and $x_2$ are continuous, the no-interaction model is:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$$
And the interaction model is:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 (x_1 \times x_2)$$
These two correspond to the "reduced" and "full" models for the F-test. Because the full model adds only the single coefficient $\beta_3$, the extra sum of squares is attributable entirely to that term. With one numerator degree of freedom, the F statistic equals the square of the t statistic for $\beta_3$, so the coefficient's p-value is the same as the p-value of the F-test.
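The single-coefficient case can be checked numerically. The following is a rough sketch in Python with numpy rather than R, using simulated data with made-up coefficients. The helper `ols_rss` and all variable names are assumptions for illustration. It fits the reduced and full models above, then compares the F statistic from the model comparison with the squared t statistic of the interaction coefficient.

```python
import numpy as np

def ols_rss(X, y):
    """Least-squares fit; returns coefficients and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

# Simulated data (assumption): two continuous predictors with an interaction.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + 0.2 * x1 * x2 + rng.normal(size=n)

ones = np.ones(n)
X_reduced = np.column_stack([ones, x1, x2])          # no-interaction model
X_full = np.column_stack([ones, x1, x2, x1 * x2])    # interaction model

_, rss_reduced = ols_rss(X_reduced, y)
beta_full, rss_full = ols_rss(X_full, y)

# F test for the one extra term: 1 numerator df, n - 4 denominator df.
df_resid = n - X_full.shape[1]
f_stat = (rss_reduced - rss_full) / (rss_full / df_resid)

# t statistic of the interaction coefficient in the full model.
sigma2 = rss_full / df_resid
cov = sigma2 * np.linalg.inv(X_full.T @ X_full)
t_stat = beta_full[3] / np.sqrt(cov[3, 3])

print("F statistic:", f_stat)
print("t statistic squared:", t_stat ** 2)
```

Since a squared t variable with $n - 4$ degrees of freedom follows an F distribution with $(1, n - 4)$ degrees of freedom, equal statistics imply equal p-values. This holds only when the interaction adds a single coefficient. With a multi-level categorical predictor, the interaction adds several coefficients and only the F-test gives one overall answer.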