Is the variance inflation factor (VIF) useful for GLMs? In the example below, OLS gives VIF > 5 while the GLM gives lower values, even though the GLM shows instability in the coefficients between the train and test sets.
> library(MASS)
> set.seed(1)
> mu <- rep(0,4)
> Sigma <- matrix(.9, nrow=4, ncol=4)
> diag(Sigma) <- 1
> rawvars <- mvrnorm(n=1000, mu=mu, Sigma=Sigma)
> d <- as.ordered( as.numeric(rawvars[,1]>0.5) )
> d[1:200] <- 1
> #
> df <- data.frame(rawvars, d)
>
> ind <- sample(1:nrow(df), 500)
> train <- df[ind,]
> test <- df[-ind,]
>
> fit.lm = lm(X4~X1+X2+X3, train)
>
> fit.lm.test = lm(X4~X1+X2+X3, test)
> summary(fit.lm)
Call:
lm(formula = X4 ~ X1 + X2 + X3, data = train)
Residuals:
Min 1Q Median 3Q Max
-1.16108 -0.24883 0.01356 0.25803 1.43455
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.02638 0.01698 1.554 0.121
X1 0.30456 0.04243 7.178 2.6e-12 ***
X2 0.33006 0.04285 7.703 7.3e-14 ***
X3 0.33138 0.04477 7.402 5.8e-13 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.3788 on 496 degrees of freedom
Multiple R-squared: 0.8553, Adjusted R-squared: 0.8544
F-statistic: 977.1 on 3 and 496 DF, p-value: < 2.2e-16
> summary(fit.lm.test)
Call:
lm(formula = X4 ~ X1 + X2 + X3, data = test)
Residuals:
Min 1Q Median 3Q Max
-1.06717 -0.25019 0.00594 0.26794 1.14282
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.02660 0.01669 -1.594 0.112
X1 0.33038 0.04216 7.836 2.85e-14 ***
X2 0.35636 0.04044 8.811 < 2e-16 ***
X3 0.27625 0.04125 6.697 5.78e-11 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.3727 on 496 degrees of freedom
Multiple R-squared: 0.8806, Adjusted R-squared: 0.8799
F-statistic: 1219 on 3 and 496 DF, p-value: < 2.2e-16
>
> car::vif(fit.lm)
X1 X2 X3
6.260418 6.329868 6.480232
> car::vif(fit.lm.test)
X1 X2 X3
7.252877 7.097506 7.138707
>
> fit.glm <- glm(d~X1+X2+X3, train, family = binomial(link="probit"))
> fit.glm.test <- glm(d~X1+X2+X3, test, family = binomial(link="probit"))
> summary(fit.glm)
Call:
glm(formula = d ~ X1 + X2 + X3, family = binomial(link = "probit"),
data = train)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.7122 -0.8662 -0.2447 0.7068 3.7270
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.08018 0.06551 -1.224 0.2210
X1 0.78783 0.17028 4.627 3.71e-06 ***
X2 0.31422 0.16544 1.899 0.0575 .
X3 -0.03526 0.17270 -0.204 0.8382
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 689.62 on 499 degrees of freedom
Residual deviance: 475.49 on 496 degrees of freedom
AIC: 483.49
Number of Fisher Scoring iterations: 5
> summary(fit.glm.test)
Call:
glm(formula = d ~ X1 + X2 + X3, family = binomial(link = "probit"),
data = test)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.4552 -0.9145 -0.3013 0.7889 2.8701
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.10849 0.06398 -1.696 0.0899 .
X1 0.77867 0.16649 4.677 2.91e-06 ***
X2 0.01052 0.15355 0.068 0.9454
X3 0.09631 0.15826 0.609 0.5428
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 690.83 on 499 degrees of freedom
Residual deviance: 505.10 on 496 degrees of freedom
AIC: 513.1
Number of Fisher Scoring iterations: 5
>
> car::vif(fit.glm)
X1 X2 X3
3.826678 3.734214 4.022859
> car::vif(fit.glm.test)
X1 X2 X3
4.633032 4.617873 4.717568
Best Answer
If we look at the function car::vif, it calls vcov, which differs between a glm and an lm. In the glm case vcov depends on the outcome, and thus you get the different results. All of the above is consistent with the 1992 article (presumably Fox & Monette, 1992, the reference given in car::vif's documentation) in the linear-model case; see particularly Equation (10).
To the question itself: I gather that the results in the 1992 article may still hold asymptotically. However, some pen and paper is likely needed to justify this claim, and I may be wrong.
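One more hedged check of where the outcome enters (again assuming fit.glm, fit.lm and train from the transcript above): vcov for the probit fit is (X'WX)^{-1} with W the IRLS working weights, and cov2cor cancels any scalar variance, so refitting a weighted lm with those working weights should reproduce car::vif(fit.glm). In that sense the glm "VIF" is the ordinary VIF of the weighted design, and the weights are how the outcome influences the result.

## The glm "VIF" as the VIF of the IRLS-weighted design (a sketch): a weighted
## lm with the same working weights W has vcov proportional to (X'WX)^{-1},
## so its coefficient correlations, and hence its car::vif values, coincide
## with those of the probit fit.
w <- fit.glm$weights                                   # working weights at convergence
fit.wls <- lm(X4 ~ X1 + X2 + X3, data = train, weights = w)
car::vif(fit.wls)      # should equal car::vif(fit.glm)
car::vif(fit.lm)       # unweighted: depends on X only, not on d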