I am trying to interpret the coefficients for the polr model. Where each predictor has three levels, Dissatisfied, Neutral, and Satisfied. Dissatisfied is the baseline. The dependent variable has the same three levels. I need help interpreting the coefficients.
library(MASS) dNom <- polr(overallPN ~ accessPN + timelyPN + fairlyPN + returnsPN + updatedPN + feesPN + ethicalPN, data= pn[pn$provider=="A",], Hess = TRUE) summary(dNom) Call: polr(formula = overallPN ~ accessPN + timelyPN + fairlyPN + returnsPN + updatedPN + feesPN + ethicalPN, data = pn[pn$provider == "A", ], Hess = T) Coefficients: Value Std. Error t value accessPNNeutral -0.9377 0.9111 -1.0292 accessPNSatisfied 0.3020 0.7100 0.4254 timelyPNNeutral 0.6333 0.8467 0.7480 timelyPNSatisfied 1.3681 0.7945 1.7220 fairlyPNNeutral 1.0702 1.0036 1.0664 fairlyPNSatisfied 0.5618 0.8365 0.6716 returnsPNNeutral 0.1903 0.6687 0.2846 returnsPNSatisfied 1.1044 0.7848 1.4073 updatedPNNeutral 0.6112 0.6528 0.9364 updatedPNSatisfied 1.6093 0.6734 2.3897 feesPNNeutral 0.3570 0.6996 0.5104 feesPNSatisfied 0.6713 0.7312 0.9181 ethicalPNNeutral 1.3364 0.7466 1.7901 ethicalPNSatisfied 1.0362 0.7072 1.4653 Intercepts: Value Std. Error t value Dissatisfied|Neutral 1.4037 0.6436 2.1812 Neutral|Satisfied 3.6533 0.7400 4.9367
As you seem to recognise already, ordinal logistic regression is an extension of simple logistic regression, so the interpretation is quite similar. Ordinal logistic regression draws multiple logistic curves (or hypersurfaces in a multi-variate case as yours) through your data, one per each successive dichotomy, with all curves having identical shape, except in the intercept term.
Let's simplify your problem to only one, continuous predictor variable, say,
accessPNcont. Your output would look something like this (the numbers are purely fictional and serve only for illustration purposes):
This can be graphically represented as in the figure below:
You have two curves, identical in shape, only that the blue one is shifted to the right. The red one represents the probability of the
overallPNbeing at least "Neutral", while the blue one represents the probability of the
overallPNbeing at least "Satisfied" (which is in this case, the same as being exactly "Satisfied"). One way of saying it could be:
This odds ratio, $2.5541$ is constant for all categories. This is the essence of proportional odds.
Now, for your real data, you have six ordinal predictor variables, which you encode as factors (something I wouldn't do). You can interpret all the variables in the same way. Since you use factors, you probably would't say "unit change" but use something like "category switch". For example:
This is very similar to what you say in your comment, only somewhat clearer for my taste.
Now, let me offer you unsolicited advice: When you use factors in regression, R will automatically convert them into unrelated binary dummy variables. For ordinal variables this is not quite correct. You cannot be at the same time "neutral" and "satisfied". Being "satisfied" implies being already more than "neutral".
Normally you'd expect the coefficient for "satisfied" to be larger than for "neutral", but in your model this is not always the case. See for example "fairly": Being "neutral" towards it (as opposed to "dissatisfied") increases the odds of "overall" by $\exp(1.0702)$, but being "satisfied" (again, as opposed to "dissatisfied"), only by $\exp(0.5618)$. This non-monotonous behaviour is likely to require an explanation and probably does not correctly reflect the real process in the population.
To avoid such problems, you could either:
The first approach has the advantage of keeping the number of predictors low, reducing the danger of overfitting and being more likely to find significant predictors. The disadvantage is that it implicitly assumes the same "step size" between "dissatisfied" and "neutral" as between "neutral" and "satisfied". For the second approach it's the other way round.