Solved – Variation explained in ordinal logistic regression models

Tags: logistic, ordered-logit, r, r-squared, regression

I have made these three ordinal logistic regression models:

library(MASS)  # polr() for ordinal (proportional odds) logistic regression

model1 <- polr(as.factor(carb) ~ mpg,  Hess = TRUE, data = mtcars)
model2 <- polr(as.factor(carb) ~ hp,   Hess = TRUE, data = mtcars)
model3 <- polr(as.factor(carb) ~ drat, Hess = TRUE, data = mtcars)

To figure out if the models are a good fit to the data, I calculated the proportion of variation explained like this:

model_null <- polr(as.factor(carb) ~ 1, Hess = TRUE, data = mtcars)

1 - (model1$deviance / model_null$deviance)
# 0.1512784
1 - (model2$deviance / model_null$deviance)
# 0.2520109
1 - (model3$deviance / model_null$deviance)
# 0.003453936

Questions:

  1. Why doesn't summary() report the null deviance?

  2. Have I calculated the proportion of variation explained correctly?

  3. Am I right in saying model1 and model3 explain little variation in carb, but model2 explains 25% of the variation in carb?

Best Answer

Even for logistic regression with a dichotomous DV, there is no exact equivalent of $R^2$ (proportion of variance explained), nor is there any consensus on which approximation is best. Here is Paul Allison's explanation. However, the version Allison likes best is that of Tjur:

But there’s another $R^2$, recently proposed by Tjur (2009), that I’m inclined to prefer over McFadden’s. It has a lot of intuitive appeal, its upper bound is 1.0, and it’s closely related to $R^2$ definitions for linear models. It’s also easy to calculate.

The definition is very simple. For each of the two categories of the dependent variable, calculate the mean of the predicted probabilities of an event. Then, take the difference between those two means. That’s it!
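To make Tjur's definition concrete, here is a minimal sketch for a binary outcome, using am (0/1) from mtcars as a hypothetical example rather than one of the ordinal models above:

# Tjur's R^2: mean predicted probability among the 1s minus the mean
# predicted probability among the 0s.
fit <- glm(am ~ mpg, family = binomial, data = mtcars)  # hypothetical binary model
p   <- fitted(fit)                                      # predicted probabilities
mean(p[mtcars$am == 1]) - mean(p[mtcars$am == 0])       # Tjur's R^2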

Unfortunately, with more than two categories there will be more than one such difference; perhaps some statistic based on these could be used. Allison, however, is not optimistic about this:

Another potential complaint is that the Tjur $R^2$ cannot be easily generalized to ordinal or nominal logistic regression. For McFadden and Cox-Snell, the generalization is straightforward.

Of those two, Allison now prefers McFadden:

Here are the details. Logistic regression is, of course, estimated by maximizing the likelihood function. Let $L_0$ be the value of the likelihood function for a model with no predictors, and let $L_M$ be the likelihood for the model being estimated. McFadden’s $R^2$ is defined as

$R^2_{\mathrm{McF}} = 1 - \ln(L_M) / \ln(L_0)$
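In R this can be computed from the log-likelihoods of the fitted polr objects. As a sketch (assuming the models from the question are still in the workspace), and noting that polr stores its deviance as $-2\ln L$, this should reproduce the deviance ratios calculated in the question:

# McFadden's R^2 = 1 - ln(L_M)/ln(L_0); because polr's $deviance is -2*logLik,
# this matches 1 - deviance(model)/deviance(null) from the question.
1 - as.numeric(logLik(model1)) / as.numeric(logLik(model_null))
1 - as.numeric(logLik(model2)) / as.numeric(logLik(model_null))
1 - as.numeric(logLik(model3)) / as.numeric(logLik(model_null))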