Solved – Difference between Variance Inflation Factor (VIF) and kappa in R


I am running a regression analysis in R:

fit <- lm(Cost ~ Slope + YardDist, data = test)

I want to test the two independent variables for multicollinearity. I tested it with vif() (from the car package) and kappa().

> vif(fit)
   Slope YardDist 
1.000121 1.000121 
> kappa(fit)
[1] 11631.87

VIF tells me there is no multicollinearity, and kappa tells me there is very high multicollinearity.
What is the difference between the two, and which one is 'right'?

Best Answer

If you only have two variables, you can just check the correlation between them. With two predictors, the VIF is:
$$ \text{VIF}=\frac{1}{\text{tolerance}}=\frac{1}{1-r^2} $$
where $r$ is the correlation between the two predictors. Kappa, on the other hand, is the condition number of the design matrix $X$; that is:
$$ \kappa=\sqrt{\frac{\max(\text{eigenvalue}(X'X))}{\min(\text{eigenvalue}(X'X))}} $$
Note that the design matrix includes the intercept column, so kappa also picks up collinearity between the predictors and the intercept, which the VIF does not measure.

One thing that is often recommended with kappa is to center your variables first (note that there is a difference of opinion about this recommendation). If your variables are far from 0, the sampling distributions of the $\beta_j$s will be correlated with the sampling distribution of $\beta_0$ (i.e., the intercept). I suspect that's what you are seeing here.
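To make the contrast concrete, here is a minimal sketch in base R. It uses simulated data as a hypothetical stand-in for the asker's `test` data frame (the means, standard deviations and coefficients are assumptions chosen only so that the predictors sit far from 0), and it computes the two-predictor VIF by hand instead of calling `car::vif()`.

```r
# Simulated data: a hypothetical stand-in for the asker's `test` data frame.
set.seed(1)
n <- 200
test <- data.frame(
  Slope    = rnorm(n, mean = 30,   sd = 5),   # nearly uncorrelated predictors,
  YardDist = rnorm(n, mean = 1000, sd = 50)   # both located far from 0
)
test$Cost <- 5 + 2 * test$Slope + 0.1 * test$YardDist + rnorm(n)

fit <- lm(Cost ~ Slope + YardDist, data = test)

# VIF in the two-predictor case: 1 / (1 - r^2)
r <- cor(test$Slope, test$YardDist)
vif_manual <- 1 / (1 - r^2)

# Condition number of the design matrix, intercept column included
X <- model.matrix(fit)
kappa_raw <- kappa(X, exact = TRUE)

# The same quantity from the eigenvalues of X'X
ev <- eigen(crossprod(X), symmetric = TRUE, only.values = TRUE)$values
kappa_eigen <- sqrt(max(ev) / min(ev))

# Center the predictors and recompute the condition number
test_c <- transform(test,
  Slope    = Slope - mean(Slope),
  YardDist = YardDist - mean(YardDist)
)
fit_c <- lm(Cost ~ Slope + YardDist, data = test_c)
kappa_centered <- kappa(model.matrix(fit_c), exact = TRUE)

print(c(vif = vif_manual, kappa_raw = kappa_raw,
        kappa_eigen = kappa_eigen, kappa_centered = kappa_centered))
```

On data like this, the VIF should stay close to 1 while the uncentered condition number is large, and centering should bring the condition number down substantially. Two caveats: `kappa(fit)` without `exact = TRUE` returns a fast approximation, so it will not match the eigenvalue formula exactly; and differences in scale between predictors also raise the condition number, so standardizing (for example with `scale()`) would likely lower it further.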

It might help you to read my question here: Is there a reason to prefer a specific measure of multicollinearity?
