Solved – VIF (collinearity) vs correlation

correlation, multicollinearity, variance-inflation-factor

I am trying to understand the basic difference between the two. From what I have read in various links, previously asked questions, and videos:

Correlation: two variables vary together, so if one changes so does the other, but this does not imply collinearity or that one can explain the other.

VIF: inflation in the variance of the regression coefficients (due to the collinearity existing among the predictors)?

I am still confused about:

  1. By variance inflation, do we mean how inflated the regression coefficients are due to two or more collinear predictors?

  2. Does a high VIF (> 10) imply high correlation, while the reverse is not true? Can this be explained with an example using variables?

Best Answer

First, I think it is better to use condition indexes rather than VIF to diagnose collinearity. See the work of David Belsley or even (if you want a soporific) my dissertation (that link seems to have vanished; this one should work, I hope).
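
For readers who have not met condition indexes, here is a minimal sketch of how they are commonly computed: scale each column of the design matrix to unit length, take the singular values, and divide the largest by each of them. The function name `condition_indexes`, the simulated data, and the seed are hypothetical choices made for illustration, and the unit-length scaling without centering is my understanding of Belsley's recommendation rather than something stated here.

```python
import numpy as np

def condition_indexes(X):
    """Condition indexes of X: largest singular value over each singular value.

    Columns are scaled to unit length first (not centered); treat that
    scaling choice as an assumption of this sketch.
    """
    X = np.asarray(X, dtype=float)
    Xs = X / np.linalg.norm(X, axis=0)
    s = np.linalg.svd(Xs, compute_uv=False)  # singular values, descending
    return s[0] / s

# Hypothetical data: x3 is almost, but not exactly, x1 + x2.
rng = np.random.default_rng(0)
n = 500
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)
x3 = x1 + x2 + 0.01 * rng.standard_normal(n)
X = np.column_stack([np.ones(n), x1, x2, x3])  # intercept column included

ci = condition_indexes(X)
print(ci)
```

The first index is 1 by construction. A frequently quoted rule of thumb treats an index above roughly 30 as a sign of strong collinearity; the near-dependence built into `x3` should push the largest index well past that.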

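However, to get to your question: It is possible to have fairly low correlations among all variables but perfect collinearity. If you have 11 independent variables, 10 of which are mutually independent and the 11th is the sum of the other 10, then the correlations among the first 10 will be about 0, and the correlation of each with the 11th will be about 0.32 (1/√10, assuming equal variances, so each shares only about 0.1 of its variance with the sum), but collinearity is perfect. So, high VIF does not imply high correlations.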
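
The 11-variable example can be checked numerically. The following is a minimal sketch, assuming standard-normal predictors, a sample size of 5000, and an arbitrary seed (none of which come from the example itself). It uses only NumPy and computes each R_j^2 by least squares rather than calling a library VIF routine; the helper `r_squared` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 5000

# Ten mutually independent standard-normal predictors.
X10 = rng.standard_normal((n, 10))
# The 11th predictor is the exact sum of the other ten.
X = np.column_stack([X10, X10.sum(axis=1)])

# Largest pairwise correlation (in absolute value) among the 11 predictors.
corr = np.corrcoef(X, rowvar=False)
off_diag = corr[~np.eye(11, dtype=bool)]
max_abs_corr = np.abs(off_diag).max()

def r_squared(y, Z):
    """R^2 from an OLS regression of y on Z plus an intercept."""
    Z1 = np.column_stack([np.ones(len(y)), Z])
    beta, *_ = np.linalg.lstsq(Z1, y, rcond=None)
    resid = y - Z1 @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

# R_j^2 from regressing predictor j on the remaining ten predictors.
r2 = np.array([r_squared(X[:, j], np.delete(X, j, axis=1)) for j in range(11)])

print("largest |pairwise correlation|:", max_abs_corr)
print("smallest R_j^2:", r2.min())
```

Since VIF_j = 1 / (1 - R_j^2), an R_j^2 that equals 1 up to floating-point error means every VIF is effectively infinite, even though no pairwise correlation is large. The code reports R_j^2 rather than the VIF itself to avoid dividing by a number that is zero to machine precision.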

It is also true that you can have pretty high correlations without it creating troublesome collinearity, but this is trickier to show. See the references.