Solved – Unstable coefficient in regression without high correlation between variables

correlation, multicollinearity, regression coefficients

I am estimating a linear regression: $Y=f(X_1,X_2,X_3,X_4,X_5)$. When the model includes only $X_4$ and $X_5$, $X_4$ is not statistically significant ($t$-value = 1.26). However, when $X_1$, $X_2$, and $X_3$ are added, it becomes significant ($t$-value = 2.36). In other words, the significance of $X_4$ depends on the presence of $X_1$, $X_2$, and $X_3$ in the model.

I tested the correlation between $X_4$ and each of $X_1$, $X_2$, and $X_3$. The bivariate correlations are all below 0.5. If I understand correctly, multicollinearity should not be a concern at that level of correlation.

My question is: Should I drop $X_4$ from my final model? If not, how should I explain its significance?

Best Answer

There are a few different things that could be going on here:

One, multicollinearity cannot be assessed from bivariate correlations alone. Consider the case where x1, x2, and x3 are all generated as independent normal random variables and x4 is the sum of x1, x2, and x3. x4 is then perfectly collinear with the other three, yet the bivariate correlations are not strong (I did a quick example and the correlations ranged from 0.55 to 0.63). So you should use a better measure of collinearity than bivariate correlations, for example the variance inflation factor (VIF).
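The example above can be sketched in a few lines. This is a minimal illustration, not the code used for the quick example mentioned in the answer: the seed, the sample size, and the variable names are all assumptions. It compares the pairwise correlations with the $R^2$ from regressing x4 on x1, x2, and x3 together, which is the quantity behind the VIF, $1/(1-R^2)$.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, chosen only for reproducibility
n = 5000                         # assumed sample size

# x1, x2, x3: independent standard normal columns; x4 is their exact sum
X = rng.standard_normal((n, 3))
x4 = X.sum(axis=1)

# Bivariate correlation of x4 with each of x1, x2, x3
pairwise = [np.corrcoef(X[:, j], x4)[0, 1] for j in range(3)]

# Auxiliary regression of x4 on x1..x3 (with intercept)
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, x4, rcond=None)
resid = x4 - A @ coef
x4c = x4 - x4.mean()
r2 = 1 - (resid @ resid) / (x4c @ x4c)

print("pairwise correlations:", np.round(pairwise, 3))
print("R^2 of x4 on x1, x2, x3:", r2)
```

The pairwise correlations should come out moderate (the theoretical value here is $1/\sqrt{3} \approx 0.58$), while the auxiliary $R^2$ is essentially 1, so the VIF for x4 is unbounded. With real data the auxiliary $R^2$ will be below 1, and the same comparison shows how much collinearity the pairwise correlations are hiding.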

Two, it may be that x1, x2, and x3 explain quite a bit of the variation in the response variable and are independent of x4. Without x1-x3 in the model, the effect of x4 is small compared to the residual variation. When you include x1-x3, the residual variation is much smaller while the effect size of x4 stays about the same, so the effect is now large relative to the residual variation and therefore significant.
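A small simulation makes this second scenario concrete. It is only a sketch under assumed values: the coefficients, noise level, sample size, and the helper `ols_t` are hypothetical and are not taken from the question's data. All five predictors are generated independently, so collinearity plays no role.

```python
import numpy as np

def ols_t(X, y):
    """OLS with an intercept: returns (coefficients, standard errors, t-values)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = A.shape[0] - A.shape[1]
    sigma2 = resid @ resid / dof              # residual variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    return beta, se, beta / se

rng = np.random.default_rng(1)   # arbitrary seed
n = 500                          # assumed sample size
x = rng.standard_normal((n, 5))  # columns x1..x5, mutually independent

# Assumed data-generating process: x1..x3 have large effects, x4 and x5 small ones
y = (2 * x[:, 0] + 2 * x[:, 1] + 2 * x[:, 2]
     + 0.3 * x[:, 3] + 0.3 * x[:, 4] + rng.standard_normal(n))

beta_red, se_red, t_red = ols_t(x[:, 3:5], y)   # model with x4 and x5 only
beta_full, se_full, t_full = ols_t(x, y)        # model with x1..x5

# Index 0 is the intercept, so x4 is index 1 in the reduced fit and 4 in the full fit
print("x4, reduced model: coef, se, t =", beta_red[1], se_red[1], t_red[1])
print("x4, full model:    coef, se, t =", beta_full[4], se_full[4], t_full[4])
```

In a run like this the estimated coefficient of x4 should be similar in both fits, but its standard error is much smaller in the full model because x1-x3 absorb most of the residual variation, and the $t$-value rises accordingly.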

There are probably other explanations as well. You should really explore your data and all the relationships, and also consider the science behind the data: the model to use should be the one that makes the most sense scientifically.
