Solved – Is a log transformation of predictors a suitable way of dealing with multicollinearity in multiple regression?

multicollinearity, multiple regression, predictor, variance-inflation-factor

Suppose two independent variables in a linear regression initially have a very high correlation of 0.95. This introduces severe multicollinearity into the model (as indicated by very high variance inflation factors). Can one take the natural logarithm of each of them (which decreases the correlation between them to 0.75) and use both in the same regression? The VIFs then no longer indicate a multicollinearity problem. Is this a reasonable approach?
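
For reference, when a model contains only two predictors, the VIF of each one follows directly from their correlation r as 1 / (1 - r^2). Below is a minimal Python sketch of that arithmetic for the two correlations quoted in the question. The function name `vif_two_predictors` is made up for illustration, and the shortcut is an assumption that does not hold as-is if the regression includes further predictors:

```python
def vif_two_predictors(r: float) -> float:
    """VIF of either predictor in a model with exactly two predictors.

    In that special case the R^2 from regressing one predictor on the
    other is simply r^2, so VIF = 1 / (1 - r^2).
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 1.0 / (1.0 - r * r)


vif_raw = vif_two_predictors(0.95)  # correlation of the raw variables
vif_log = vif_two_predictors(0.75)  # correlation after taking logs

print(f"VIF at r = 0.95: {vif_raw:.2f}")  # about 10.26
print(f"VIF at r = 0.75: {vif_log:.2f}")  # about 2.29
```

In this two-predictor case the drop in VIF is just the drop in correlation restated. It says nothing by itself about whether the logged model is the right one.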

Best Answer

Sometimes variables are just correlated. That's not necessarily bad; it's just the way it is.

If you're running a regression and controlling for a variable, that's because you want to control for that variable. If you log-transform the predictors, you'll alter the meaning of the variables, you'll alter the distribution of the residuals, and you might introduce non-linear effects. If all of that makes sense for your problem, go ahead and log them.
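
To make the change in meaning concrete, here is a small sketch comparing what one coefficient implies in a level model, y = a + b*x, and in a log model, y = a + b*ln(x). The slope `b = 2.0` and the function names are hypothetical, chosen only for illustration:

```python
import math

b = 2.0  # hypothetical slope, used for both specifications


def change_in_y_level(x_from: float, x_to: float) -> float:
    """Change in predicted y under y = a + b * x."""
    return b * (x_to - x_from)


def change_in_y_log(x_from: float, x_to: float) -> float:
    """Change in predicted y under y = a + b * ln(x); needs x > 0."""
    return b * (math.log(x_to) - math.log(x_from))


# Level model: one extra unit of x shifts y by b, wherever you start.
print(change_in_y_level(1, 2), change_in_y_level(100, 101))

# Log model: one extra unit of x matters less and less as x grows ...
print(change_in_y_log(1, 2), change_in_y_log(100, 101))

# ... while doubling x always shifts y by b * ln(2).
print(change_in_y_log(1, 2), change_in_y_log(100, 200))
```

So after logging, the coefficient describes proportional changes in the predictor, not unit changes. Whether that is the quantity you wanted to control for is a question about the subject matter, not about the VIF.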

In addition, if a log transformation reduces the correlation that much, the variables must have slightly strange distributions. Again, that might be OK, but it might also be something you'll want to look into.
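
One way this can happen, sketched below under stated assumptions: a few extreme values can dominate the raw correlation, and taking logs pulls them in. The data are made up for illustration (five small, weakly related values plus one extreme point), and `pearson` is a hand-written helper, not a library call:

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: five small, weakly related values plus one extreme point.
x1 = [1, 2, 3, 4, 5, 100]
x2 = [3, 1, 5, 2, 4, 120]

r_raw = pearson(x1, x2)
r_log = pearson([math.log(v) for v in x1], [math.log(v) for v in x2])

print(f"raw correlation: {r_raw:.3f}")
print(f"log correlation: {r_log:.3f}")  # lower: the extreme point is pulled in
```

If something like this is going on in your data, a scatter plot of the raw predictors is probably a more informative first check than the VIFs.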
