Regression – Interpreting Ceteris Paribus with Multiple 3-Way Interactions

interaction, regression

I have a regression model that includes multiple 3-way interactions, in which two of the interacting variables ($X$ and $Z$) occur in every interaction and only the third changes. Let's call the changing variables the variables of interest and denote them $W_1$, $W_2$ and $W_3$.

My question is about the interpretation of ceteris paribus (holding other things constant) when analysing one of the variables of interest (more on that further down).

Without including additional continuous or categorical independent variables I may want to control for, the regression model is

$$ Y = \beta_0 + \beta_1 X + \beta_2Z + \beta_3 XZ + \underbrace{\beta_4 W_1 + \beta_5 XW_1 + \beta_6 ZW_1 + \beta_7 XZW_1}_\text{First variable of interest} + \underbrace{\beta_8 W_2 + \beta_9 XW_2 + \beta_{10} ZW_2 + \beta_{11} XZW_2}_\text{Second variable of interest} + \underbrace{\beta_{12} W_3 + \beta_{13} XW_3 + \beta_{14} ZW_3 + \beta_{15} XZW_3}_\text{Third variable of interest}$$

where $Z$ and $W_1$, $W_2$, $W_3$ are binary variables that can be either high or low, while $X$ is continuous (time).

If we take one of the variables of interest, $W_1$, we can define the different slopes as follows (the same goes for $W_2$ and $W_3$)

And thereafter compute the mean pairwise difference between any two slopes (here also shown only for $W_1$)

What I am wondering about is the interpretation of ceteris paribus (holding other things constant) when analysing one of the variables of interest. Say, for example, that I am looking at $Z_{\text{high}}$ versus $Z_{\text{low}}$ at $W_1 = 0$. For a continuous independent variable that is not part of the focal interactions, the comparison would hold at any value of that variable; for a categorical independent variable (also not part of the focal interactions), it would hold regardless of its level.

But if I analysed the same comparison (i.e. $Z_{\text{high}}$ versus $Z_{\text{low}}$ at $W_1 = 0$), at what levels are $W_2$ and $W_3$? Are they at their reference categories (i.e. low), or does it not matter?

Best Answer

With treatment/dummy coding, the individual coefficient for a predictor involved in higher-level interactions is that predictor's effect when all of the predictors it interacts with are at their reference levels. So the $\beta_2$ coefficient for $Z$ in your setup is the $Z_{\text{high}}$ versus $Z_{\text{low}}$ difference when the interacting binary predictors $W_1$, $W_2$ and $W_3$ are all at their reference levels (taken to be 0) and the continuous predictor $X$ has a value of 0.

You can see this for the $Z_{\text{high}}$ versus $Z_{\text{low}}$ comparison by examining all of the terms involving $Z$, as the other terms cancel in that comparison:

$$ \beta_2Z + \beta_3 XZ + \beta_6 ZW_1 + \beta_7 XZW_1 + \beta_{10} ZW_2 + \beta_{11} XZW_2 + \beta_{14} ZW_3 + \beta_{15} XZW_3. $$
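To make the cancellation explicit, here is a short worked version, assuming $Z$ is coded 0 for low and 1 for high and writing $\hat Y$ for the predicted value. Setting $Z = 1$ and subtracting the $Z = 0$ prediction leaves

$$ \hat Y(Z=1) - \hat Y(Z=0) = \beta_2 + \beta_3 X + (\beta_6 + \beta_7 X) W_1 + (\beta_{10} + \beta_{11} X) W_2 + (\beta_{14} + \beta_{15} X) W_3, $$

so with $W_1 = 0$ the contrast is $\beta_2 + \beta_3 X + (\beta_{10} + \beta_{11} X) W_2 + (\beta_{14} + \beta_{15} X) W_3$.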

Your proposed formula for $Z_{\text{high}}$ versus $Z_{\text{low}}$ at $W_1 = 0$, $\beta_2 + \beta_3X$, thus only holds for the situation where $W_2$ and $W_3$ are also at zero. Otherwise, you need to include the further interaction terms involving $Z$ and non-zero values of $W_2$ and/or $W_3$.
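As a quick numerical check of this point, here is a minimal sketch in Python. The coefficient values are made up purely for illustration, and the helper names `predict` and `z_contrast` are hypothetical; only the structure of the model comes from the question.

```python
import numpy as np

# Hypothetical coefficients beta_0 .. beta_15 (illustrative values only).
beta = np.arange(16) / 10.0

def predict(x, z, w1, w2, w3):
    """Linear predictor of the model in the question (no error term)."""
    y = beta[0] + beta[1] * x + beta[2] * z + beta[3] * x * z
    for k, wk in enumerate((w1, w2, w3)):
        # Block of four coefficients for W_k: W_k, X*W_k, Z*W_k, X*Z*W_k
        b = beta[4 + 4 * k : 8 + 4 * k]
        y += b[0] * wk + b[1] * x * wk + b[2] * z * wk + b[3] * x * z * wk
    return y

def z_contrast(x, w1, w2, w3):
    """Predicted Y at Z = 1 (high) minus predicted Y at Z = 0 (low)."""
    return predict(x, 1, w1, w2, w3) - predict(x, 0, w1, w2, w3)

x = 2.0
at_reference = z_contrast(x, 0, 0, 0)  # W_1 = W_2 = W_3 = 0
w2_high = z_contrast(x, 0, 1, 0)       # W_1 = 0 but W_2 = 1

# At the reference levels the contrast reduces to beta_2 + beta_3 * X.
print(at_reference, beta[2] + beta[3] * x)
# Moving W_2 to 1 shifts the contrast by beta_10 + beta_11 * X.
print(w2_high - at_reference, beta[10] + beta[11] * x)
```

The two numbers on each printed line should agree, which shows that the $Z$ contrast at $W_1 = 0$ still depends on where $W_2$ and $W_3$ are held.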
