Solved – Dealing with multicollinearity when removing a highly collinear predictor reduces significance

interaction, multicollinearity, regression

In my experience dealing with multicollinearity, removing one collinear variable from the model often results in the other collinear variable(s) becoming significant (assuming that all of the collinear variables are significantly bivariately correlated with the dependent variable).

However, I've recently encountered a situation with the model shown below, in which $X_2$ and $X_3$ are highly correlated ($r > 0.95$) and the tolerance scores for both variables are below $0.1$.

$$
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_1 X_2 + \beta_5 X_1 X_3
$$

*All variables are continuous.

The results of the regression model show that all 5 slopes are significant (3 for the main effects and 2 for the interactions). One possible solution is to remove one of the highly collinear predictors. If I remove one of them, say $X_2$, I get the new model shown below.

$$
Y = \beta_0^\prime + \beta_1^\prime X_1 + \beta_3^\prime X_3 + \beta_5^\prime X_1 X_3
$$

After $X_2$ is removed, $\beta_0^\prime$, $\beta_1^\prime$, and $\beta_3^\prime$ remain significant, but crucially $\beta_5^\prime$, the slope for the interaction term, is no longer significant. The same thing happens if I instead remove $X_3$. What might be causing this pattern, and how can it be dealt with? Thank you in advance!
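(For reference, the model comparison can be reproduced along the following lines in R; the data frame below is simulated placeholder data standing in for my real data, so it won't necessarily show the same significance pattern.)

```r
# Simulated placeholder data standing in for the real data frame (columns Y, X1, X2, X3)
set.seed(123)
n   <- 200
X1  <- rnorm(n)
X3  <- rnorm(n)
X2  <- X3 + rnorm(n, sd = 0.1)               # X2 and X3 highly correlated
Y   <- 1 + X1 + X2 + X3 + X1 * X3 + rnorm(n)
dat <- data.frame(Y, X1, X2, X3)

# Full model: three main effects plus the X1:X2 and X1:X3 interactions
fit_full    <- lm(Y ~ X1 + X2 + X3 + X1:X2 + X1:X3, data = dat)

# Reduced model after dropping X2 and its interaction with X1
fit_reduced <- lm(Y ~ X1 + X3 + X1:X3, data = dat)

summary(fit_full)
summary(fit_reduced)

# Tolerance of X2 among the main effects: 1 - R^2 from regressing X2 on the other predictors
1 - summary(lm(X2 ~ X1 + X3, data = dat))$r.squared
```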

Best Answer

Correlation between two independent variables is not necessarily a sign of troublesome collinearity. The guru of collinearity, David Belsley, showed this in his books *Conditioning Diagnostics: Collinearity and Weak Data in Regression* and *Regression Diagnostics: Identifying Influential Data and Sources of Collinearity*.

In the comments, @whuber points out that collinearity is not always a problem that has to be dealt with, and that your maximum condition index indicates that here it is not a problem at all.
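For readers who want to check this on their own data, Belsley-style condition indexes can be computed in a few lines of base R: scale each column of the design matrix (including the intercept) to unit length, take the singular values, and divide the largest by each of them. The data below are simulated placeholders.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x3 <- rnorm(n)
x2 <- x3 + rnorm(n, sd = 0.1)        # highly correlated pair, as in the question

# Design matrix with the intercept column, as Belsley recommends
X <- cbind(1, x1, x2, x3, x1 * x2, x1 * x3)

# Scale columns to unit length (no centering), then take singular values
Xs <- sweep(X, 2, sqrt(colSums(X^2)), "/")
d  <- svd(Xs)$d

round(max(d) / d, 1)   # condition indexes; Belsley's rule of thumb flags values above ~30
```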

At the other end, it is also possible to have very high collinearity without any high correlations. One example of this is if there are 10 IVs, 9 of which are independent and the 10th is the sum of the other 9.
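A quick simulation makes the point (illustrative only): with nine independent standard-normal predictors and a tenth that is essentially their sum, no pairwise correlation rises much above $0.3$, yet the design matrix is almost exactly singular.

```r
set.seed(2)
n   <- 500
X9  <- matrix(rnorm(n * 9), ncol = 9)      # nine mutually independent predictors
x10 <- rowSums(X9) + rnorm(n, sd = 1e-6)   # tenth is (almost exactly) their sum
X   <- cbind(X9, x10)

R <- cor(X)
max(abs(R[upper.tri(R)]))     # largest pairwise correlation is modest (about 0.33)
kappa(X, exact = TRUE)        # condition number is enormous: near-perfect collinearity
```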

In addition to condition indexes, and developed after Belsley's books were written, R has the perturb package, which examines collinearity by adding small amounts of random noise to the input data and seeing what happens; one of the problems collinearity can cause is that small changes to the input data lead to huge changes in the regression results. In one of his books, Belsley gives an example where changing the data in the third or fourth significant digit reverses the signs of regression coefficients.
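The sketch below illustrates the idea behind that kind of perturbation analysis in base R on simulated data; it is not the perturb package's actual interface, just the principle of refitting after adding tiny noise to the collinear predictors.

```r
set.seed(3)
n  <- 200
x1 <- rnorm(n)
x3 <- rnorm(n)
x2 <- x3 + rnorm(n, sd = 0.05)     # nearly collinear with x3
y  <- 1 + x1 + x2 + x3 + rnorm(n)

# Refit the model ten times after nudging the collinear predictors by tiny amounts
coef_runs <- t(replicate(10, {
  x2p <- x2 + rnorm(n, sd = 0.01)
  x3p <- x3 + rnorm(n, sd = 0.01)
  coef(lm(y ~ x1 + x2p + x3p))
}))

# With troublesome collinearity, the x2/x3 coefficients can swing wildly
# (sometimes changing sign) across these nearly identical data sets
round(coef_runs, 2)
```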