How to deal with failing the proportional odds assumption in ordinal logistic regression

assumptions, logistic, odds-ratio, ordered-logit, regression

I am attempting to do ordinal logistic regression but I keep failing to pass the proportional odds assumption. Almost all of my features are shown to have high significance, but the only model that I can fit that passes the Chi-Squared test for proportional odds is rather trivial.

What is the typical way of rectifying this? Is it like linear regression, where I can add more interactions or higher-order terms? If so, how do I go about finding (visually) the ideal adjustments to make to my model? (In linear regression I could plot the residuals against a predictor to see where linearity fails. Is there an equivalent for ordinal logistic regression?)

Best Answer

It's true, in general, that "calibration" tests behave differently under different sets of regression adjustments. Many statisticians begin and end fishing expeditions by choosing the magic set of adjustments that leads to a clean pass on model-based assumptions. But adjustments also change the scientific question, so they shouldn't be chosen on the basis of meeting "calibration" requirements. The proportional odds test is just such a goodness-of-fit test. Here, I use "calibration" as a quasi-counterpart to goodness of fit, where significance means rejection of the model-based assumptions. Of course, in large samples almost all calibration tests fail, not because the model is invalid but because the test is overpowered and a 0.05 significance level is arbitrary and not useful.
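Since a significance test says little about how large the departure from proportional odds actually is, it can help to look at the magnitude directly. The sketch below is one informal way to do that for a single binary predictor: it dichotomizes the ordered response at each cutpoint and compares the resulting log odds ratios, which should be roughly equal if the proportional odds assumption holds. The function name `cumulative_log_odds_ratios` and the counts are hypothetical, made up for illustration, and the code assumes no cumulative cell is zero.

```python
import math

def cumulative_log_odds_ratios(ref_counts, cmp_counts):
    """Log odds ratio of Y > j (comparison vs. reference group) at each cutpoint j."""
    if len(ref_counts) != len(cmp_counts):
        raise ValueError("both groups need the same number of ordered categories")
    result = []
    for j in range(1, len(ref_counts)):
        # Split the ordered categories into "at or below the cutpoint" and "above it".
        ref_low, ref_high = sum(ref_counts[:j]), sum(ref_counts[j:])
        cmp_low, cmp_high = sum(cmp_counts[:j]), sum(cmp_counts[j:])
        # Difference in log odds of a higher response between the two groups.
        log_or = math.log(cmp_high / cmp_low) - math.log(ref_high / ref_low)
        result.append(log_or)
    return result

# Hypothetical counts over four ordered response categories (lowest to highest).
control = [40, 30, 20, 10]
treated = [20, 30, 25, 25]

log_ors = cumulative_log_odds_ratios(control, treated)
for j, value in enumerate(log_ors, start=1):
    print(f"cutpoint {j}: log OR = {value:.3f}, OR = {math.exp(value):.2f}")
```

If the odds ratios are of similar size across cutpoints, a single proportional odds estimate may be a reasonable summary even when a formal test rejects. How much variation is tolerable is a judgment about the scientific question, not something this sketch decides.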

With a larger number of adjustments in the model, the proportional odds model tends toward a saturated model, in which the fitted values are close or identical to the observed stratum-specific probabilities. This is a problem when the data are sparse.

One way around non-proportional odds is simply to fit the log-linear model, the most general form of analysis for categorical data. From it you can derive any number of summaries: the multinomial model, or odds ratios for a cumulatively higher response relative to each reference category.
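As a small illustration of the multinomial summary, consider the simplest case of one binary predictor. A saturated model reproduces the observed cell proportions, so its baseline-category log odds ratios can be read straight off the contingency table without any fitting. The sketch below does that. The function name `baseline_log_odds_ratios` and the counts are hypothetical, and with more predictors or sparse cells you would fit the model with a statistics package instead.

```python
import math

def baseline_log_odds_ratios(ref_counts, cmp_counts, baseline=0):
    """Log odds ratio of category k vs. the baseline category, comparison vs. reference group."""
    if len(ref_counts) != len(cmp_counts):
        raise ValueError("both groups need the same number of response categories")
    result = {}
    for k in range(len(ref_counts)):
        if k == baseline:
            continue
        # Odds of category k relative to the baseline category, within each group.
        cmp_odds = cmp_counts[k] / cmp_counts[baseline]
        ref_odds = ref_counts[k] / ref_counts[baseline]
        result[k] = math.log(cmp_odds) - math.log(ref_odds)
    return result

# Hypothetical counts over four response categories (index 0 is the baseline).
control = [40, 30, 20, 10]
treated = [20, 30, 25, 25]

multinomial_log_ors = baseline_log_odds_ratios(control, treated)
for k, value in multinomial_log_ors.items():
    print(f"category {k} vs 0: log OR = {value:.3f}, OR = {math.exp(value):.2f}")
```

Unlike the proportional odds model, this gives a separate odds ratio for each response category, so nothing is constrained to be equal across categories. The cost is more parameters and a summary that ignores the ordering of the response.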