I've run a regression on U.S. counties, and am checking for collinearity in my 'independent' variables. Belsley, Kuh, and Welsch's Regression Diagnostics suggests looking at the Condition Index and Variance Decomposition Proportions:
library(perturb)
## colldiag(, scale=TRUE) for model with interaction
Condition
Index Variance Decomposition Proportions
(Intercept) inc09_10k unins09 sqmi_log pop10_perSqmi_log phys_per100k nppa_per100k black10_pct hisp10_pct elderly09_pct inc09_10k:unins09
1 1.000 0.000 0.000 0.000 0.000 0.001 0.002 0.003 0.002 0.002 0.001 0.000
2 3.130 0.000 0.000 0.000 0.000 0.002 0.053 0.011 0.148 0.231 0.000 0.000
3 3.305 0.000 0.000 0.000 0.000 0.000 0.095 0.072 0.351 0.003 0.000 0.000
4 3.839 0.000 0.000 0.000 0.001 0.000 0.143 0.002 0.105 0.280 0.009 0.000
5 5.547 0.000 0.002 0.000 0.000 0.050 0.093 0.592 0.084 0.005 0.002 0.000
6 7.981 0.000 0.005 0.006 0.001 0.150 0.560 0.256 0.002 0.040 0.026 0.001
7 11.170 0.000 0.009 0.003 0.000 0.046 0.000 0.018 0.003 0.250 0.272 0.035
8 12.766 0.000 0.050 0.029 0.015 0.309 0.023 0.043 0.220 0.094 0.005 0.002
9 18.800 0.009 0.017 0.003 0.209 0.001 0.002 0.001 0.047 0.006 0.430 0.041
10 40.827 0.134 0.159 0.163 0.555 0.283 0.015 0.001 0.035 0.008 0.186 0.238
11 76.709 0.855 0.759 0.796 0.219 0.157 0.013 0.002 0.004 0.080 0.069 0.683
## colldiag(, scale=TRUE) for model without interaction
Condition
Index Variance Decomposition Proportions
(Intercept) inc09_10k unins09 sqmi_log pop10_perSqmi_log phys_per100k nppa_per100k black10_pct hisp10_pct elderly09_pct
1 1.000 0.000 0.001 0.001 0.000 0.001 0.003 0.004 0.003 0.003 0.001
2 2.988 0.000 0.000 0.001 0.000 0.002 0.030 0.003 0.216 0.253 0.000
3 3.128 0.000 0.000 0.002 0.000 0.000 0.112 0.076 0.294 0.027 0.000
4 3.630 0.000 0.002 0.001 0.001 0.000 0.160 0.003 0.105 0.248 0.009
5 5.234 0.000 0.008 0.002 0.000 0.053 0.087 0.594 0.086 0.004 0.001
6 7.556 0.000 0.024 0.039 0.001 0.143 0.557 0.275 0.002 0.025 0.035
7 11.898 0.000 0.278 0.080 0.017 0.371 0.026 0.023 0.147 0.005 0.038
8 13.242 0.000 0.001 0.343 0.006 0.000 0.000 0.017 0.129 0.328 0.553
9 21.558 0.010 0.540 0.332 0.355 0.037 0.000 0.003 0.003 0.020 0.083
10 50.506 0.989 0.148 0.199 0.620 0.393 0.026 0.004 0.016 0.087 0.279
?HH::vif
suggests that VIFs >5 are problematic:
library(HH)
## vif() for model with interaction
inc09_10k unins09 sqmi_log pop10_perSqmi_log phys_per100k nppa_per100k black10_pct hisp10_pct
8.378646 16.329881 1.653584 2.744314 1.885095 1.471123 1.436229 1.789454
elderly09_pct inc09_10k:unins09
1.547234 11.590162
## vif() for model without interaction
inc09_10k unins09 sqmi_log pop10_perSqmi_log phys_per100k nppa_per100k black10_pct hisp10_pct
1.859426 2.378138 1.628817 2.716702 1.882828 1.471102 1.404482 1.772352
elderly09_pct
1.545867
Whereas John Fox's Regression Diagnostics suggests looking at the square root of the VIF:
library(car)
## sqrt(vif) for model with interaction
inc09_10k unins09 sqmi_log pop10_perSqmi_log phys_per100k nppa_per100k black10_pct hisp10_pct
2.894589 4.041025 1.285917 1.656597 1.372987 1.212898 1.198428 1.337705
elderly09_pct inc09_10k:unins09
1.243879 3.404433
## sqrt(vif) for model without interaction
inc09_10k unins09 sqmi_log pop10_perSqmi_log phys_per100k nppa_per100k black10_pct hisp10_pct
1.363608 1.542121 1.276251 1.648242 1.372162 1.212890 1.185108 1.331297
elderly09_pct
1.243329
In the first two cases (where a clear cutoff is suggested), the model is problematic only when the interaction term is included.
The model with the interaction term has until this point been my preferred specification.
I have two questions given this quirk of the data:
- Does an interaction term always worsen the collinearity of the data?
- Since the two variables without the interaction term are not above the threshold, am I ok using the model with the interaction term. Specifically, the reason I think this might be ok is that I'm using the King, Tomz, and Wittenberg (2000) method to interpret the coefficients (negative binomial model), where I generally hold the other coefficients at the mean, and then interpret what happens to predictions of my dependent variable when I move
inc09_10k
andunins09
around independently and jointly.
Best Answer
Yes, this is usually the case with non-centered interactions. A quick look at what happens to the correlation of two independent variables and their "interaction"
And then when you center them:
Incidentally, the same can happen with including polynomial terms (i.e., $X,~X^2,~...$) without first centering.
So you can give that a shot with your pair.
As to why centering helps - but let's go back to the definition of covariance
\begin{align} \text{Cov}(X,XY) &= E[(X-E(X))(XY-E(XY))] \\ &= E[(X-\mu_x)(XY-\mu_{xy})] \\ &= E[X^2Y-X\mu_{xy}-XY\mu_x+\mu_x\mu_{xy}] \\ &= E[X^2Y]-E[X]\mu_{xy}-E[XY]\mu_x+\mu_x\mu_{xy} \\ \end{align}
Even given independence of X and Y
\begin{align} \qquad\qquad\qquad\, &= E[X^2]E[Y]-\mu_x\mu_x\mu_y-\mu_x\mu_y\mu_x+\mu_x\mu_x\mu_y \\ &= (\sigma_x^2+\mu_x^2)\mu_y-\mu_x^2\mu_y \\ &= \sigma_x^2\mu_y \\ \end{align}
This doesn't related directly to your regression problem, since you probably don't have completely independent $X$ and $Y$, and since correlation between two explanatory variables doesn't always result in multicollinearity issues in regression. But it does show how an interaction between two non-centered independent variables causes correlation to show up, and that correlation could cause multicollinearity issues.
Intuitively to me, having non-centered variables interact simply means that when $X$ is big, then $XY$ is also going to be bigger on an absolute scale irrespective of $Y$, and so $X$ and $XY$ will end up correlated, and similarly for $Y$.