Solved – Coefficients for every group in ordered logistic regression (polr) in R

logisticordered-logitrregression

When running an ordered logistic regression using the polr function of the MASSpackage (DV is low, medium, high) and have a look at the summary I get βs for every IV and the intercepts for low|medium and medium|high.

The predictfunction for assessing the probabilities (type='p') or the classes (type='class') also works just fine.

However I want to calculate the probabilities myself in order to use them with different data sets.

If I use the following code for a logistic model with a binary (!) dependent variable, I can exactly replicate the predict – outcome:

log_pred <- (logit_model$coefficients[1] + logit_model$coefficients[2]*IV_1 + logit_model$coefficients[3]*IV_2)

  • logit_model is my glm-object
  • logit_model$zeta[1] is the first intercept
  • logit_model$zeta[2] is the second intercept
  • logit_model$coefficients[1] is the β of IV_1
  • logit_model$coefficients[2] is the β of IV_2

the only thing I have to do now, to get the predicted probabilities is:

log_pred_probs <- exp(log_pred)/(1+exp(log_pred))

If I understand all the posts on ordered logistic regression I read correctly, the only thing I have to change with a polr object with the 3 "groups" of low, medium, and high would be to:

  • run the log-predpart for each group using their own intercepts, let's call them log_pred1 and log_pred2
  • and to, then, run the following code (similar to the logistic model above):
    log_pred_probs1 <- exp(log_pred1)/(1+exp(log_pred1)+exp(log_pred2)) for "low"
    log_pred_probs2 <- exp(log_pred2)/(1+exp(log_pred1)+exp(log_pred2)) for "medium"
    log_pred_probs3 <- 1/(1+exp(log_pred1)+exp(log_pred2)) for "high"

I think there are at least two problems ('cause this doesn't work at all):

  1. I need the β-coefficients for every level of the dependent variable, and summary(polr-object)does only show the βs for the first group (so does $coefficients)
  2. and I am not sure about the computation of the predicted probabilities for group 3, "high".

So these are the questions in short: How do I assess the β-coefficients for every level of the DV in a polrobject?

And

How do I compute the predicted probabilities for every level of the DV myself?

Best Answer

Alright, you can hit me now - after a couple of days of thinking I figured the answer out myself - but in any case that someone should have the same problem, please continue reading:

The answer is very (!) simple. Since this method is called proportional odds logistic regression, the coefficients are of course the same for every level of the dependent variable. And you get two thresholds for three DV levels (thresholds are the negative intercept, thus you'll notice a minus-sign below)

You just do this:

log_pred_probs1 <- 1-(exp(-logit_model$zeta[1] +
logit_model$coefficients[1] * IV_1 + logit_model$coefficients[2] * IV_2)/
(exp(-logit_model$zeta[1] + logit_model$coefficients[1] * IV_1 + 
logit_model$coefficients[2] * IV_2))

log_pred_probs2 <- 1-(exp(-logit_model$zeta[2] +
logit_model$coefficients[1] * IV_1 + logit_model$coefficients[2] * IV_2)/
(exp(-logit_model$zeta[2] + logit_model$coefficients[1] * IV_1 + 
logit_model$coefficients[2] * IV_2)) - log_pred_probs1

notice the two intercepts AND the subtraction of the first level's probability!

and finally

log_pred_probs3 <- 1 - log_pred_probs2 - log_pred_probs1

It's easy as that. Cheers!

Related Question