Regression – Do Interaction Terms Affect the Power of Main Effects Not Included in Them?

generalized-linear-model, interaction, mathematical-statistics, regression, statistical-power

Say we fit a regression model between the binary variables $\mathbf{X} = (X_1, X_2, X_3)$ and a continuous response variable $Y$, with conditional mean
$$E(Y| \mathbf{X}) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3X_3 + \beta_{2,3}X_2 X_3.$$

Does including the interaction term $\beta_{2,3}$ affect the power to reject $H_0: \beta_1 = 0$, even though $X_1$ is not part of the interaction? Does including the interaction affect the estimation of $\beta_1$ in any way?

Can we say anything about the behavior of $\hat{\beta}_1$ if we add more binary variables $X_2, \dots, X_J$ together with their interaction terms? Can this be extended to generalized linear models? See the formula below:

$$g\{ E(Y_i | \mathbf{X}) \} = \beta_0 + \beta_1 X_1 + \sum_{j=2}^{J}\beta_j X_j + \sum_{j,k \neq 1} \beta_{j,k} X_j X_k.$$

Edit: I'll be more precise about what I mean by power. Say you have a test with 80% power to reject $H_0: \beta_1 = 0$ when $\beta_1 = \beta_1^*$ and the true model is $E(Y|X) = \beta_0 + \beta_1 X_1$. By adding additional $X_j$ with $j \neq 1$, you are fitting a misspecified model, and adding interaction terms makes the model "more" misspecified. Is there any effect of adding multiple interactions that don't involve $X_1$, e.g. $\sum_{j,k \neq 1} \beta_{j,k} X_j X_k$?
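Here is a minimal simulation sketch of that power comparison, assuming a convenient data-generating process (binary covariates sharing a latent factor, Gaussian errors, OLS); the sample size, effect sizes, and correlation structure are illustrative assumptions, not part of the question:

```python
# Hypothetical DGP: true mean depends only on X1, but X1, X2, X3 are
# correlated through a shared latent Z, so X2*X3 correlates with X1.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, n_sims, alpha = 200, 2000, 0.05
beta1_star = 0.4  # illustrative "true" beta_1 under the simple model

def p_value_for_beta1(with_interaction):
    z = rng.binomial(1, 0.5, n)                    # shared latent factor
    x1 = rng.binomial(1, 0.3 + 0.4 * z, n)         # correlated binary X's
    x2 = rng.binomial(1, 0.3 + 0.4 * z, n)
    x3 = rng.binomial(1, 0.3 + 0.4 * z, n)
    y = 0.5 + beta1_star * x1 + rng.normal(0, 1, n)  # true model uses X1 only
    cols = [x1, x2, x3] + ([x2 * x3] if with_interaction else [])
    X = sm.add_constant(np.column_stack(cols))
    return sm.OLS(y, X).fit().pvalues[1]           # p-value for beta_1

for with_int in (False, True):
    power = np.mean([p_value_for_beta1(with_int) < alpha
                     for _ in range(n_sims)])
    print(f"interaction included: {with_int}, power ≈ {power:.3f}")
```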

Best Answer

Yes, it affects the power in three ways.

First, adding $X_2X_3$ to the model changes the true value of $\beta_1$ unless $X_2X_3$ is uncorrelated with $X_1$. In some designed experiments it would be natural for these to be uncorrelated, but in other sorts of data there typically isn't a reason to expect them to be uncorrelated. The coefficient could change by a large or small amount in either direction, so the power could go up or down. The power might even become not-very-well-defined if the new value of $\beta_1$ were 0.
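A sketch of this point under a hypothetical data-generating process (the coefficients and correlation structure below are made up for illustration): when $X_2X_3$ both appears in the true mean and is correlated with $X_1$, the population coefficient on $X_1$ differs between the two specifications.

```python
# Large n, so the fitted coefficients approximate the population
# (projection) coefficients of each specification.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200_000
z = rng.binomial(1, 0.5, n)              # latent factor linking X1 to X2*X3
x1 = rng.binomial(1, 0.2 + 0.6 * z, n)
x2 = rng.binomial(1, 0.2 + 0.6 * z, n)
x3 = rng.binomial(1, 0.2 + 0.6 * z, n)
y = 0.5 + 0.4 * x1 + 0.8 * x2 * x3 + rng.normal(0, 1, n)

for cols in ([x1, x2, x3], [x1, x2, x3, x2 * x3]):
    fit = sm.OLS(y, sm.add_constant(np.column_stack(cols))).fit()
    print(f"beta_1 ≈ {fit.params[1]:.3f}")  # shifts once X2*X3 enters
```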

Second, again if $X_2X_3$ is correlated with $X_1$, putting it in the model will affect the variance of $\hat\beta_1$, because the variance of $\hat\beta_1$ is inversely proportional to the variance of $X_1$ conditional on everything else in the model. This effect will tend to reduce the power: the variance of $X_1$ conditional on $X_2X_3$ is smaller than its variance not conditional on it.[*]
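One way to make this concrete, assuming the homoskedastic linear model (an assumption not stated in the answer), is the standard variance-inflation identity
$$\operatorname{Var}(\hat\beta_1) \;=\; \frac{\sigma^2}{n\,\widehat{\operatorname{Var}}(X_1)\,\bigl(1 - R_1^2\bigr)},$$
where $R_1^2$ is the $R^2$ from regressing $X_1$ on the other columns of the design matrix. Adding $X_2X_3$ can only increase $R_1^2$, which shrinks the denominator and inflates $\operatorname{Var}(\hat\beta_1)$.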

Third, adding $X_2X_3$ to the model will tend to reduce the residual variance (if its coefficient is not zero), and so reduce the standard error of $\hat\beta_1$ and increase the power.
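A sketch of the third point with illustrative numbers: here $X_2X_3$ is roughly uncorrelated with $X_1$ but carries a real effect, so including it soaks up residual variance and shrinks the standard error of $\hat\beta_1$.

```python
# Hypothetical DGP: independent binary covariates, so X2*X3 is
# uncorrelated with X1, but X2*X3 has a nonzero coefficient.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 5_000
x1, x2, x3 = (rng.binomial(1, 0.5, n) for _ in range(3))  # independent X's
y = 0.5 + 0.4 * x1 + 1.5 * x2 * x3 + rng.normal(0, 1, n)  # real X2*X3 effect

for cols in ([x1, x2, x3], [x1, x2, x3, x2 * x3]):
    fit = sm.OLS(y, sm.add_constant(np.column_stack(cols))).fit()
    # residual variance drops when X2*X3 enters, so se(beta_1) drops too
    print(f"se(beta_1) ≈ {fit.bse[1]:.4f}")
```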

[*] I'm being loose with language: 'conditional' here refers to linear projections rather than true conditional expectations.