The fact that these coefficients are all for factors in R means that the Intercept is the log-odds for the event, i.e. log(proportion with event / proportion without), for subjects who have all their factor values at the lowest level. We know that of the 1615 in level 1 of the factor under scrutiny, 1088 survived, although 1088/(1615-1088) (= the odds of survival given factor = 1) will not necessarily match exp(Intercept), since not all of those people also had the other factors at their lowest level. In fact the odds of survival (or it could be death, since it has not really been made clear how the events were coded) were quite different. In the best case of all factors being = 1, the odds of survival were:
exp( 4.2284770 )
[1] 68.61266
That's actually a pretty low odds for newborn survival, so this must have been a NICU study or something happening in a third-world country. But it's way higher than the odds for all children who had that value along with any other values for the other predictors. (After looking at the single line of data it was neither ... it was a different species.) In R the way to get a quick estimate of survival probability would be:
(pred1s <- predict(mod, newdata=as.data.frame( with( lesna,
list( seas = 1 ,
btw=1 , prectem5 =1 ,
pcscore =1 , pindx5 =1 ,
presp2= 1 , ppscore= 1,
mtone2 = 1 , fos = 1 ,
psex = 1 , pscolor = 1,
pshiv =1 , backfat5= 1 ,
srect2 =levels(srect2)[1] , gest3 =1 ,
int3= 1 , agit= 1 , tacc =1 ) )
),
type="response" )
) # the outer parens are to get a value to print when the assignment is made
You can then compare exp(Intercept) to that (atypical) prediction divided by (1 - prediction) ... since that's what odds are. So I hope it's becoming clear: you need to specify all of the factor level values at once to get a prediction. Or you need to create a synthetic cohort with a specific composition of all factor levels. You cannot take a single factor distribution and create a prediction from such a complex model unless you specify some sort of average value for all the other factors.
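To make that comparison concrete, the arithmetic for turning a predicted probability into odds can be sketched as follows (Python is used purely for the arithmetic; the probability value is a made-up stand-in for whatever predict() actually returns):

```python
import math

# Hypothetical predicted probability; substitute the value that
# predict(mod, ..., type="response") returns in R.
pred = 0.8087

odds = pred / (1 - pred)     # odds implied by the prediction
log_odds = math.log(odds)    # back on the linear-predictor (logit) scale

# exp(Intercept) equals these odds only when every predictor is at its
# baseline level, as discussed above.
```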
Edit after looking at the single line of data:
Some (most, in fact) of those variables were not factors (and I guess I should have recognized that), which means they do not have any levels in your dataframe, but do have levels in the model matrix. I had assumed that the levels would be attributes of the factor variable, but I was wrong. It's going to be easier to work with this model if the 'newdata' argument given to predict
has the same structure as the original dataframe, and I would seriously consider making a copy and turning all those items into factors. With the exception of "srect2" we can change all of those items to 1, at least under the assumption that that is the minimum value for each of those variables. If it's not, then you need to use the minimum value. Code edited.
response value calculated:
1.4418 is the log-odds (the Intercept in the linear predictor for the baseline category)
odds = Pr(X=1)/(1-Pr(X=1)) :: definition
log(odds) = log(Pr(X=1)/(1-Pr(X=1)) ) = 1.4418 :: starting point
Solve for Pr(X=1) ... should be = the calculated "response" value.
exp(1.4418)*(1- Pr(X=1) ) = Pr(X=1)
exp(1.4418) = (1+exp(1.4418))* Pr(X=1)
Pr(X=1) = exp(1.4418) / (1+exp(1.4418))
> exp(1.4418) / (1+exp(1.4418))
[1] 0.8087332
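The same inversion can be checked numerically outside R; a quick sketch (Python, purely for the arithmetic, using the 1.4418 intercept quoted above):

```python
import math

log_odds = 1.4418                                   # Intercept from above
p = math.exp(log_odds) / (1 + math.exp(log_odds))   # inverse logit

# Round trip: the log of the implied odds recovers the intercept,
# and p comes out at about 0.8087, matching the "response" value.
assert abs(math.log(p / (1 - p)) - log_odds) < 1e-12
```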
All of the other levels need to have the Intercept added to get their correct linear predictors. A difference in log-odds, i.e. a coefficient, corresponds directly to a ratio on the odds scale, hence exp(coef) gives a set of odds ratios.
{ log(x) - log(y) = c } <=> { x/y = exp(c) }
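A numeric illustration of that difference-in-log-odds identity (Python, with a made-up coefficient just to show the pattern):

```python
import math

intercept = 1.4418   # baseline log-odds, from above
beta = 0.7           # hypothetical coefficient for some factor level

odds_base = math.exp(intercept)          # odds at baseline
odds_level = math.exp(intercept + beta)  # odds with the coefficient added

# The ratio of the two odds is exp(beta): the odds ratio.
ratio = odds_level / odds_base
```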
--
The odds is not the same as the probability. The odds is the number of "successes" (deaths) per "failure" (continue to live), while the probability is the proportion of "successes". I find it instructive to compare how one would estimate these two: An estimate of the odds would be the ratio of the number of successes over the number of failures, while an estimate of the probability would be the ratio of the number of success over the total number of observations.
Odds and probabilities are both ways of quantifying how likely an event is, so it is not surprising that there is a one to one relation between the two. You can turn a probability ($p$) into an odds ($o$) using the following formula: $o=\frac{p}{1-p}$. You can turn an odds into a probability like so: $p = \frac{o}{1+o}$.
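Those two conversion formulas can be written down directly; a minimal sketch (Python):

```python
def prob_to_odds(p):
    """o = p / (1 - p)"""
    return p / (1 - p)

def odds_to_prob(o):
    """p = o / (1 + o)"""
    return o / (1 + o)

# p = .5 corresponds to even odds (o = 1), and the two functions
# are inverses of each other.
```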
So to come back to your example:
- The baseline probability is .5, so you would expect to find 1 failure per success, i.e. the baseline odds is 1. This odds is multiplied by a factor 5.8, so then the odds would become 5.8, which you can transform back to a probability as: $\frac{5.8}{1+5.8}\approx.85$ or 85%
- A two degree change in temperature is associated with a change in the odds of death by a factor of $5.8^2\approx33.6$. So the baseline odds is still 1, which means the new odds would be 33.6, i.e. you would expect 33.6 dead fish for every live fish, and the probability of finding a dead fish is $\frac{33.6}{1+33.6} \approx .97$
- A three degree change in temperature leads to a new odds of death of $1\times 5.8^3\approx195$. So the probability of finding a dead fish is $\frac{195}{1+195}\approx.99$
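The three bullet points above follow one pattern: multiply the baseline odds by the odds ratio raised to the number of degrees, then convert back to a probability. Sketched in Python with the numbers from the example:

```python
baseline_odds = 1.0   # baseline probability .5 corresponds to odds of 1
odds_ratio = 5.8      # per one-degree change in temperature

probs = {}
for degrees in (1, 2, 3):
    odds = baseline_odds * odds_ratio ** degrees
    probs[degrees] = odds / (1 + odds)
# reproduces the .85, .97 and .99 probabilities in the bullets above
```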
--
Best Answer
That is a huge association. It goes from basically everybody below among those who did not attend to everybody above among those who did.
Fitting the model:
gives
As noted, an OR of 18 ($\approx \exp(3)$).
Typing summary for the model gives one way of doing inference: the two-sided Wald $p$-value is:
2*pt(44.1, df=3, lower.tail=F) = 0.000025
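That pt() tail probability can be reproduced without R: the t distribution with 3 degrees of freedom has a closed-form CDF, so a quick check (Python, standard library only) is:

```python
import math

def t3_upper_tail(t):
    """P(T > t) for a t distribution with 3 df, using its closed-form CDF:
    F(t) = 1/2 + (1/pi) * (atan(t/sqrt(3)) + sqrt(3)*t / (t*t + 3))."""
    return 0.5 - (math.atan(t / math.sqrt(3))
                  + math.sqrt(3) * t / (t * t + 3)) / math.pi

p_two_sided = 2 * t3_upper_tail(44.1)   # roughly 2.6e-05
```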
As you note, MASS does not calculate $p$-values this way because the "intercept(s)" terms do not have the same mathematical properties as the intercept in a logistic model, so you don't know what their distribution and standard error would be if the null hypothesis were true. Fitting the reduced model and comparing it to the full model with a LRT is the way to overcome this. If I fit intercept only:
i <- polr( exam ~ 1, weights=w)
then
anova(f, i)
is:
As you note, the huge association produces a huge test statistic, with an unsurprising result. The $p$-value of 0 is just a consequence of rounding; $p$-values are never exactly 0. Reporting it out to 3 (or even 2) digits, e.g. $p < 0.001$ or $p < 0.01$, suffices, especially since significance testing is more concerned with whether the alpha level is met or exceeded than with the precise value of the $p$-value.
The interpretation of the coefficient is:
The odds of achieving a more desirable exam performance rating for a student who "attended" (having clarified beforehand how "attended" was defined) were 18 times the odds for a student who did not.
Since the categories are so few, you can also just summarize the predictions:
You can say that more than 50% of those who did not attend were below average, whereas only 6% of those who did attend were; and that only 2% of those who did not attend were above expectation, versus 31% of those who did attend.
Another implementation of proportional odds that has more "off the shelf" functionality comes from Frank Harrell's rms package, specifically the lrm function. Fitting it gives the same Wald and LRT statistics that I calculated before.