Hierarchical Regression – Interpreting Hierarchical Regression Models with Interaction Terms

interaction, interpretation, multiple-regression, r, regression

I am running a multiple regression to test my hypotheses, which include interaction terms. I have some control variables and three key predictors A, B and C. I used hierarchical regression models, which add the predictors step by step.

Here is my sequence of models:

model 1: only include control variables to see how they relate to the dependent variable

model 2: add A B and C based on model 1

model 3: add the interaction of A and B based on model 2

model 4: add the interaction of AB and BC based on model 3

Q1: In model 2, predictor B is significant, but it's not significant in models 3 and 4, where the interaction terms were added. Should I say B has a significant impact on my dependent variable?

Q2: The significance of some control variables varies across the models. How should I combine the results when interpreting them?

Q3: Do I have to include the three-way interaction term ABC, even though it's not one of my hypotheses?

Thanks a lot for your help!!!

Best Answer

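With respect to your Question 1, the apparent "significance" of an individual coefficient involved in an interaction depends on how its interacting predictor variables are coded. Once the A-by-B interaction is in the model, the coefficient for B is the slope of B when A equals zero, not an overall effect of B, so it changes whenever the zero point of A changes. See this page and this page and this page, among others on this site. So the change in B's significance between model 2 and models 3 and 4 can be ignored.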
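To make the coding dependence concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the data-generating values, the variable names, and the small `ols` helper (a plain normal-equations solver; in practice you would use a regression library). It fits the same model twice, once with A as recorded and once with A mean-centered.

```python
import random

def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    M = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]

# Hypothetical data: A lives far from zero, B is roughly standardized.
random.seed(0)
n = 200
A = [random.uniform(10, 20) for _ in range(n)]
B = [random.gauss(0, 1) for _ in range(n)]
y = [1 + 2 * a + 0.5 * b + 1.5 * a * b + random.gauss(0, 1)
     for a, b in zip(A, B)]

mean_A = sum(A) / n
A_c = [a - mean_A for a in A]  # same predictor, different zero point

# Coefficient order: intercept, A, B, A*B
raw = ols([A, B, [a * b for a, b in zip(A, B)]], y)
cen = ols([A_c, B, [a * b for a, b in zip(A_c, B)]], y)

print("B coefficient, A as recorded:", raw[2])
print("B coefficient, A centered:   ", cen[2])
print("interaction coefficient:     ", raw[3], cen[3])
```

The interaction coefficient is identical in both fits, while the B coefficient shifts by the interaction coefficient times the mean of A. Both fits describe the same model; only the question answered by the B coefficient changes.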

With respect to Question 2, what you want is a single model that appropriately describes your data. Your multiple models arose from your step-by-step variable selection based on "statistical significance" at each step. That is not a good idea. The "statistical significance" values in such an approach aren't even correct, as they haven't taken into account your use of the outcomes to select the predictors.
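A small simulation illustrates why p-values computed after data-driven selection are not trustworthy. This is a deliberately simplified sketch, not a reproduction of the models in the question: the outcome and all candidate predictors are pure noise, the sample size and number of candidates are made-up values, and "selection" just means keeping the candidate most correlated with the outcome.

```python
import math
import random

def corr(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)
n, n_candidates, n_sims = 50, 20, 500  # hypothetical settings
# Two-sided 5% critical |r| for n = 50 (t of about 2.0106 on 48 df)
r_crit = 0.2787

hits = 0
for _ in range(n_sims):
    y = [random.gauss(0, 1) for _ in range(n)]
    # Keep only the noise predictor that looks best against the outcome
    best = max(
        abs(corr([random.gauss(0, 1) for _ in range(n)], y))
        for _ in range(n_candidates)
    )
    hits += best > r_crit

false_positive_rate = hits / n_sims
print("share of simulations where the selected noise predictor is 'significant':",
      false_positive_rate)
```

A single pre-specified noise predictor would pass the 5% test about 5% of the time. The predictor chosen because it looked best passes far more often, which is the sense in which the naive significance values ignore the selection step.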

Frank Harrell describes a much better way to proceed in his course notes and book, in particular Chapter 4. Decide on the number of degrees of freedom that you can afford to spend on estimating coefficients, decide where to spend them in the model (on non-linear associations of continuous covariates with the outcome, interactions expected to be of interest, etc.), and spend them in a single model.
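As a rough sketch of that budgeting step, the tally below uses made-up numbers: the sample size and the count of control variables are hypothetical, each predictor is assumed to be continuous or binary and entered linearly (1 degree of freedom each), and the divisor of 15 is a commonly quoted rule of thumb for a continuous outcome (values between 10 and 20 are also cited), not a hard limit.

```python
n_obs = 300              # hypothetical sample size
budget = n_obs // 15     # rule-of-thumb number of coefficients you can afford

# Degrees of freedom each part of the single planned model would use,
# assuming 1 df per linear continuous or binary term.
terms = {
    "control variables": 5,      # hypothetical count
    "A, B, C main effects": 3,
    "A:B and B:C interactions": 2,
    "A:B:C interaction": 1,
}

spent = sum(terms.values())
for name, df in terms.items():
    print(f"{name:28s} {df:2d} df")
print(f"{'total':28s} {spent:2d} df of a budget of about {budget}")
```

If the total exceeds the budget, the choice of what to drop or simplify is made from subject-matter priorities before looking at how each term relates to the outcome.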

With respect to Question 3, see Harrell's modeling approach outlined above. If you don't want to spend an extra degree of freedom on the three-way interaction, you don't necessarily have to. If you have enough data, however, including it might be beneficial, because it lets you anticipate potential arguments from skeptical reviewers of your work.
