Solved – Ordinal logistic regression interpretation of polr summary in R

ordered-logit, polr, r, regression

I am trying to interpret the coefficients of a polr model in which each predictor has three levels (Dissatisfied, Neutral, and Satisfied), with Dissatisfied as the baseline. The dependent variable has the same three levels. I need help interpreting the coefficients.

library(MASS)

## proportional-odds (ordinal logistic) model for provider "A"
dNom <- polr(overallPN ~ accessPN + timelyPN + fairlyPN + returnsPN +
                 updatedPN + feesPN + ethicalPN,
             data = pn[pn$provider == "A", ], Hess = TRUE)
summary(dNom)

Call:
polr(formula = overallPN ~ accessPN + timelyPN + fairlyPN + returnsPN + 
    updatedPN + feesPN + ethicalPN, data = pn[pn$provider == "A", ], 
    Hess = TRUE)

Coefficients:
                     Value Std. Error t value
accessPNNeutral    -0.9377     0.9111 -1.0292
accessPNSatisfied   0.3020     0.7100  0.4254
timelyPNNeutral     0.6333     0.8467  0.7480
timelyPNSatisfied   1.3681     0.7945  1.7220
fairlyPNNeutral     1.0702     1.0036  1.0664
fairlyPNSatisfied   0.5618     0.8365  0.6716
returnsPNNeutral    0.1903     0.6687  0.2846
returnsPNSatisfied  1.1044     0.7848  1.4073
updatedPNNeutral    0.6112     0.6528  0.9364
updatedPNSatisfied  1.6093     0.6734  2.3897
feesPNNeutral       0.3570     0.6996  0.5104
feesPNSatisfied     0.6713     0.7312  0.9181
ethicalPNNeutral    1.3364     0.7466  1.7901
ethicalPNSatisfied  1.0362     0.7072  1.4653

Intercepts:
                     Value   Std. Error t value
Dissatisfied|Neutral  1.4037  0.6436     2.1812
Neutral|Satisfied     3.6533  0.7400     4.9367

Best Answer

As you seem to recognise already, ordinal logistic regression is an extension of simple logistic regression, so the interpretation is quite similar. Ordinal logistic regression draws multiple logistic curves (or hypersurfaces, in a multivariate case like yours) through your data, one for each successive dichotomy, with all curves identical in shape and differing only in their intercepts.
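For reference, the model that polr fits (in MASS's parameterisation) is $\text{logit}\,P(Y \le j \mid x) = \zeta_j - \eta$ with $\eta = \beta^\top x$, where the $\zeta_j$ are the reported intercepts (cut points) and $\beta$ are the reported coefficients; only $\zeta_j$ changes between the successive dichotomies.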

Let's simplify your problem to a single continuous predictor, say accessPNcont. Your output would then look something like this (the numbers are purely fictional and serve only for illustration):

Call:
polr(formula = overallPN ~ accessPNcont, data = pn[pn$provider == "A", ], Hess = T)

Coefficients:
                     Value Std. Error t value
accessPNcont        0.9377     0.9111  1.0292

Intercepts:
                      Value   Std. Error t value
Dissatisfied|Neutral  1.4037  0.6436     2.1812
Neutral|Satisfied     3.6533  0.7400     4.9367

This can be graphically represented as in the figure below:

[Figure "polr-cont": the two fitted cumulative-probability curves]
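In case the image doesn't display, here is a minimal sketch that reproduces the two curves from the fictional numbers above, using polr's parameterisation $\text{logit}\,P(Y \le k \mid x) = \zeta_k - \eta$ (the variable names are just for illustration):

x   <- seq(-3, 8, length.out = 200)
eta <- 0.9377 * x                                 # linear predictor
p_atleast_neutral   <- 1 - plogis(1.4037 - eta)   # red curve:  P(overallPN >= "Neutral")
p_atleast_satisfied <- 1 - plogis(3.6533 - eta)   # blue curve: P(overallPN >= "Satisfied")

plot(x, p_atleast_neutral, type = "l", col = "red",
     xlab = "accessPNcont", ylab = "Probability", ylim = c(0, 1))
lines(x, p_atleast_satisfied, col = "blue")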

You have two curves, identical in shape, except that the blue one is shifted to the right. The red one represents the probability of overallPN being at least "Neutral", while the blue one represents the probability of overallPN being at least "Satisfied" (which, in this case, is the same as being exactly "Satisfied"). One way of saying it could be:

A unit increase in accessPNcont increases the odds of overallPN making the leap into a higher category by a factor of $e^{0.9377} = 2.5541$.

This odds ratio, $2.5541$, is constant across all category thresholds. This is the essence of the proportional-odds assumption.

Now, for your real data, you have seven ordinal predictor variables, which you encode as factors (something I wouldn't do). You can interpret all of them in the same way. Since you use factors, you probably wouldn't say "unit change" but rather something like "category switch". For example:

A change in the access response from "dissatisfied" to "neutral" changes the odds of switching into a higher "overall" category by a factor of $e^{-0.9377} = 0.3915$, i.e. it decreases them.

This is very similar to what you say in your comment, only somewhat clearer for my taste.
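If you want all of these odds ratios at once, you can get them (and profile-likelihood confidence intervals for them) directly from your fitted model:

## odds ratios for all coefficients of the fitted model
exp(coef(dNom))

## profile-likelihood confidence intervals on the odds-ratio scale
## (may take a moment to compute)
exp(confint(dNom))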

Now, let me offer some unsolicited advice: when you use factors in a regression, R automatically converts them into unrelated binary dummy variables. For ordinal variables this is not quite appropriate: you cannot be "neutral" and "satisfied" at the same time, and being "satisfied" already implies being more than "neutral".

Normally you would expect the coefficient for "satisfied" to be larger than the one for "neutral", but in your model this is not always the case. See for example "fairly": being "neutral" towards it (as opposed to "dissatisfied") increases the odds of a higher "overall" rating by a factor of $\exp(1.0702)$, but being "satisfied" (again, as opposed to "dissatisfied") only by $\exp(0.5618)$. This non-monotonic behaviour likely requires an explanation and probably does not correctly reflect the real process in the population.

To avoid such problems, you could either:

  1. use numerical instead of ordinal predictors (say 0, 1, and 2 for "dissatisfied", "neutral", and "satisfied"), or
  2. manually encode each predictor in two binary dummy variables, with the meaning "at least neutral" (i.e. "neutral or satisfied") and "satisfied". A "dissatisfied" response would be encoded as $(0, 0)$, a "neutral" as $(1, 0)$, and a "satisfied" as $(1, 1)$.

The first approach has the advantage of keeping the number of predictors low, reducing the danger of overfitting and being more likely to find significant predictors. The disadvantage is that it implicitly assumes the same "step size" between "dissatisfied" and "neutral" as between "neutral" and "satisfied". For the second approach it's the other way round.
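To make this concrete, here is a minimal sketch of both recodings for a single predictor, assuming accessPN holds the responses as a factor (or character) with the levels "Dissatisfied", "Neutral", and "Satisfied" (the new column names are just illustrative):

## 1. numerical score: 0 = Dissatisfied, 1 = Neutral, 2 = Satisfied
lev <- c("Dissatisfied", "Neutral", "Satisfied")
pn$accessScore <- as.integer(factor(pn$accessPN, levels = lev)) - 1

## 2. cumulative ("at least") dummies
pn$accessAtLeastNeutral <- as.integer(pn$accessPN %in% c("Neutral", "Satisfied"))
pn$accessSatisfied      <- as.integer(pn$accessPN == "Satisfied")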