I am fitting a logistic regression in which all of the independent variables are categorical. Although some of the assumptions of a linear regression model can be relaxed for a logistic regression model, multicollinearity still needs to be checked in the sample data. How do I quantify the multicollinearity between several categorical variables? I looked through the existing questions and answers here on this topic; a few suggest fitting a linear regression to the same predictors, even though it is the wrong model for the outcome, and looking at the VIF to decide. Would that be enough? Or are there other methods specific to this purpose? Any suggestions are highly appreciated. Thank you in advance.
Solved – Identifying multicollinearity of categorical variables in a logistic regression
categorical data, logistic, multicollinearity
Related Question
- Solved – Multicollinearity among categorical variables – Is it normal
- Solved – How to calculate the variance inflation factor for a categorical predictor variable when examining multicollinearity in a linear regression model
- Solved – How to detect multicollinearity in a logistic regression where all the independent variables are categorical and binary
- Solved – check multicollinearity before regression in R
- Solved – Logistic Regression: multicollinearity and Kappa statistics
Best Answer
The VIF has been generalized to deal with logistic regression (assuming you mean a model with a binary dependent variable). In R, you can do this using the vif function in the car package. As @RichardHardy has said, it is not a test, though. In the end you will get some GVIFs (generalized VIFs) and still need to make some subjective decisions. The thing to keep in mind is that high VIFs mean the standard errors of some of your estimates will be inflated, so effects that could be meaningful may not be detected as significant. The books and writings by John Fox, who also co-wrote the car package, are a great resource for understanding multicollinearity.
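As a rough illustration of the approach described in this answer, here is a minimal sketch in R. The data are simulated and the variable names (region, plan, smoker) are hypothetical, chosen only so that two of the categorical predictors are associated; none of this comes from the original question.

```r
library(car)  # provides vif()

# Hypothetical simulated data: three categorical predictors, binary outcome.
set.seed(1)
n <- 500
region <- factor(sample(c("north", "south", "east", "west"), n, replace = TRUE))

# 'plan' follows 'region' about 80% of the time, so the two are associated.
plan_by_region <- ifelse(region %in% c("north", "south"), "basic", "premium")
plan_random    <- sample(c("basic", "premium"), n, replace = TRUE)
plan <- factor(ifelse(runif(n) < 0.8, plan_by_region, plan_random))

smoker <- factor(sample(c("no", "yes"), n, replace = TRUE))

# Binary outcome generated from an assumed logistic model.
y <- rbinom(n, 1, plogis(-0.5 + 0.8 * (plan == "premium") + 0.4 * (smoker == "yes")))

fit <- glm(y ~ region + plan + smoker, family = binomial)

# Because 'region' has more than one degree of freedom, vif() returns one row
# per term with the columns GVIF, Df and GVIF^(1/(2*Df)).
gvif <- vif(fit)
print(gvif)
```

The GVIF^(1/(2*Df)) column is the one to compare across terms with different numbers of levels; squaring it puts it on roughly the same scale as an ordinary VIF. As the answer notes, any cutoff applied to these values is a judgment call rather than a formal test.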