Solved – Logistic regression with poor goodness of fit (Hosmer-Lemeshow)

Tags: logistic, multicollinearity, r-squared, regression

I built a model with 9 categorical predictor variables. Using SPSS, my omnibus test was significant ($\chi^2$=220.01) and my -2 log-likelihood was 1335.2 (Nagelkerke $R^2$=0.231), but my Hosmer and Lemeshow test was also significant ($\chi^2$=16.2, p=0.042), which indicates lack of fit. My sample size is n=1199.

Is it problematic to proceed with this model, despite lack of fit?

I tried removing one of my binary predictor variables, and noticed that in the new model my Hosmer and Lemeshow test was no longer significant (p=0.198), but my -2 log-likelihood increased to 1442.2. Is there a trade-off between the Hosmer and Lemeshow test and the -2 log-likelihood?

How do I decide which model is appropriate?

My method for building this model was to enter all my predictor variables of interest at once and examine the adjusted ORs of my variable of interest. Is my model a poor fit?

Additionally, how does one go about testing for multicollinearity when the predictor variables are all categorical (in other words, when calculating VIF is not possible)? I was told in class that as long as the standard errors of the beta coefficients are less than 2 there is no reason to suspect multicollinearity, but I am not sure whether this is sufficient.

Best Answer

The Hosmer and Lemeshow test is obsolete, as has been discussed elsewhere on this site. See also "Goodness-of-fit test in Logistic regression; which 'fit' do we want to test?".

My Regression Modeling Strategies course notes at https://hbiostat.org/rms present a hopefully coherent strategy for logistic regression model specification and validation. We almost never want to trade off predictive discrimination just to make a model have a better calibration curve.
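To put the two -2 log-likelihood values from the question in perspective, here is a minimal sketch of the likelihood-ratio comparison between the full and reduced models. It assumes both models were fit to the same 1199 observations and that the dropped binary predictor accounts for exactly 1 degree of freedom; neither is confirmed by the question, and the variable names are only illustrative.

```python
from math import erfc, sqrt

# Values reported in the question (SPSS output).
neg2ll_full = 1335.2      # model with all 9 predictors
neg2ll_reduced = 1442.2   # model after dropping one binary predictor

# Likelihood-ratio statistic: the increase in -2 log-likelihood
# caused by removing the predictor.
lr_stat = neg2ll_reduced - neg2ll_full

# Assumption: the dropped binary predictor costs 1 degree of freedom.
# For 1 df, the chi-square upper tail is P(X > x) = erfc(sqrt(x / 2)).
p_value = erfc(sqrt(lr_stat / 2))

print(f"LR chi-square = {lr_stat:.1f} on 1 df, p = {p_value:.3g}")
```

Under those assumptions the dropped predictor carries a great deal of predictive information, so removing it to obtain a non-significant Hosmer and Lemeshow test would sacrifice discrimination for an apparent gain in calibration.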

Note that if the model contains only categorical variables and interactions among the variables are not needed, the model must fit the data and no calibration assessment is needed.

To check for collinearity I suggest variable clustering, e.g., the varclus function in the R Hmisc package.
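varclus is an R function and is not reproduced here. As a rough stand-in for readers who want a quick look at redundancy among categorical predictors, the sketch below computes pairwise Cramér's V, a different and simpler measure than the similarity matrices varclus clusters on. The function name cramers_v and the toy data are hypothetical, made up for illustration only.

```python
from collections import Counter
from math import sqrt


def cramers_v(x, y):
    """Cramér's V between two equal-length sequences of category labels."""
    n = len(x)
    joint = Counter(zip(x, y))   # observed cell counts
    row = Counter(x)             # marginal counts for x
    col = Counter(y)             # marginal counts for y
    chi2 = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    # V is undefined when either variable has a single level; return 0 then.
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Made-up illustrative data, not from the question.
smoker = ["y", "y", "n", "n", "y", "n", "y", "n"]
drinks = ["y", "y", "n", "n", "y", "n", "y", "n"]  # duplicates smoker
sex = ["m", "f", "m", "f", "m", "f", "f", "m"]     # unrelated to smoker

print(cramers_v(smoker, drinks))  # perfectly redundant pair
print(cramers_v(smoker, sex))     # independent pair
```

Values of V near 1 flag pairs of predictors that carry nearly the same information. Pairwise measures can miss redundancy that involves three or more variables, which is one reason to prefer a clustering approach on real data.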