First off, are your two independent variables entered as factors or as numerically coded values, and is there an interaction term between the two? I ask because the test of proportional odds becomes very sensitive with small cell counts. For this reason, I often find it justifiable to enter ordinal input variables as their numerically coded values (1: poor, 2: fair-to-poor, etc.). Doing so allows information to be borrowed across groups, and proportionality is then assessed on a single slope: the difference in the odds of a more favorable response, comparing units differing by 1 in the predictor, should be consistent with the odds of an even more favorable response (a rough and contrived interpretation of the test of proportional odds).
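A minimal sketch of that coding, with hypothetical variable names (cond, outcome, and d are not from the question):

library(MASS)   # polr() fits the proportional odds model
# hypothetical ordinal predictor with sparse upper categories
cond <- factor(c("poor", "fair-to-poor", "fair", "good", "fair", "poor"),
               levels = c("poor", "fair-to-poor", "fair", "good"))
cond_num <- as.numeric(cond)   # 1, 2, 3, 4: one slope borrows information across adjacent levels
# fit <- polr(outcome ~ cond_num, data = d)   # rather than outcome ~ cond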
Secondly, if your numeric coding still fails to give valid proportionality, it is possible to get consistent cumulative odds ratio estimates by collapsing adjacent categories, such as the two bottom box responses.
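A minimal sketch of collapsing the two bottom boxes in base R (toy level names, not from the question):

resp <- factor(c("very poor", "poor", "fair", "good", "fair", "poor"),
               levels = c("very poor", "poor", "fair", "good"))
levels(resp)[levels(resp) %in% c("very poor", "poor")] <- "poor or worse"   # identical levels are merged
table(resp)   # the two bottom boxes are now a single category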
Thirdly, another test of association between an ordinal response and two ordinal factors is a plain old linear regression model. Using robust standard errors, you get valid confidence intervals regardless of the distribution of the errors. This tends to be less powerful than categorical methods, but it has fewer pitfalls due to zero cell counts.
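A minimal sketch of that approach, assuming the ordinal response has been numerically scored and using the sandwich and lmtest packages (toy data):

library(sandwich)   # vcovHC() for heteroskedasticity-consistent covariance
library(lmtest)     # coeftest() for inference with a supplied covariance
score <- c(1, 2, 1, 3, 2, 3, 2, 1)   # ordinal response, numerically scored
grp   <- c(0, 0, 0, 1, 1, 1, 0, 1)   # a binary predictor
m_lin <- lm(score ~ grp)
coeftest(m_lin, vcov = vcovHC(m_lin))   # robust (sandwich) standard errors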
Lastly, as a comment, robust standard errors allow consistent estimation of the mean model in most circumstances. I'm not sure whether these are implemented in SPSS, but R and SAS provide them routinely. As with the proportional hazards assumption in the Cox model, when this "model-based assumption check" fails, it does not mean the model results are entirely invalid; it just means the effect estimates are "averaged" over their inconsistent proportionality. For instance, if a proportional odds model has an excessive number of respondents giving top box responses, and a predictor shows a large association for the top box response but a smaller association for the other cumulative measures, then the cumulative odds ratio is a weighted combination of the several thresholded odds ratios, with a higher weight placed upon the top box OR.
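To see that "averaging" concretely, you can fit a separate binary logistic regression at each cumulative threshold and compare the threshold-specific odds ratios with the single pooled proportional odds estimate; a minimal sketch with simulated data:

set.seed(1)
trt    <- rbinom(200, 1, 0.5)            # binary predictor
latent <- rnorm(200, mean = 0.8 * trt)   # latent scale shifted by trt
resp3  <- cut(latent, breaks = c(-Inf, -0.5, 0.5, Inf), labels = 1:3)
for (k in 2:3) {                         # each cumulative threshold
  fit_k <- glm(I(as.numeric(resp3) >= k) ~ trt, family = binomial)
  cat("threshold >=", k, ": OR =", round(exp(coef(fit_k)["trt"]), 2), "\n")
}
# a proportional odds model pools these into one weighted cumulative OR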
That is a huge association: it goes from basically everybody below among those who did not attend to basically everybody at or above among those who did.
Fitting the model:
library(MASS)   # provides polr()

att  <- c(0, 1, 0, 1, 0, 1)                  # attendance indicator
exam <- factor(c(0, 0, 1, 1, 2, 2))          # ordinal exam rating
w    <- c(1482, 300, 1094, 2822, 57, 1422)   # cell counts used as weights
f    <- polr(exam ~ att, weights = w, Hess = TRUE)   # Hess = TRUE so summary() gets SEs without refitting
gives
Call:
polr(formula = exam ~ att, weights = w)
Coefficients:
att
2.925251
Intercepts:
0|1 1|2
0.2565983 3.7156750
Residual Deviance: 11686.09
AIC: 11692.09
As noted, an OR of about 18 ($e^{2.925} \approx 18.6$).
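The odds ratio and a profile-likelihood confidence interval can also be pulled straight from the fit:

exp(coef(f))      # about 18.6, the cumulative odds ratio for att
exp(confint(f))   # profile-likelihood 95% CI (may take a moment)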
Typing summary(f) gives one way of doing inference:
Call:
polr(formula = exam ~ att, weights = w)
Coefficients:
Value Std. Error t value
att 2.925 0.06634 44.1
Intercepts:
Value Std. Error t value
0|1 0.2566 0.0390 6.5819
1|2 3.7157 0.0667 55.7015
Residual Deviance: 11686.09
AIC: 11692.09
the two-sided Wald $p$-value is 2*pt(44.1, df=3, lower.tail=FALSE) = 0.000025. As you note, MASS does not calculate $p$-values this way because the "intercept" terms do not have the same mathematical properties as the intercept in a logistic model, so you don't know what their distribution and standard error would be if the null hypothesis were true. Fitting the reduced model and comparing the two with a likelihood ratio test (LRT) is the way to overcome this.
If I fit the intercept-only model:
i <- polr(exam ~ 1, weights = w)   # reduced model: no predictor
then anova(i, f)
gives:
Likelihood ratio tests of ordinal regression models
Response: exam
Model Resid. df Resid. Dev Test Df LR stat. Pr(Chi)
1 1 7175 14382.09
2 att 7174 11686.09 1 vs 2 1 2695.993 0
As you note, a huge association and a huge test statistic give an unsurprising result. The $p$-value of 0 is just a consequence of rounding; $p$-values are never exactly 0. Reporting it to 3 (or even 2) digits as $p < 0.001$ (or $p < 0.01$) suffices, especially since significance testing is more concerned with whether the alpha level is met or exceeded than with the actual precision of the $p$-value.
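For what it's worth, the tail probability here is so small that computing it directly underflows double precision; base R's format.pval reports it sensibly:

p <- pchisq(2695.993, df = 1, lower.tail = FALSE)   # underflows to exactly 0
format.pval(p, eps = 0.001)                         # prints "<0.001"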
The interpretation of the coefficient is:
The odds of achieving a more desirable exam performance rating for a student who "attended" (having clarified beforehand how attendance was defined) were 18 times those of a student who did not.
Since the categories are so few, you can also just summarize the predictions:
> round(predict(f, type = 'probs', newdata = data.frame(att=0:1)), 2)
0 1 2
1 0.56 0.41 0.02
2 0.06 0.62 0.31
You can say that more than 50% of students who did not attend were below average, whereas only 6% of those who attended were. Likewise, only 2% of those who did not attend were above expectation, versus 31% of those who did.
Another implementation of proportional odds that has more "off the shelf" functionality comes from Frank Harrell's rms package, specifically the lrm function. Fitting:
> lrm(exam ~ att, weights = w)
Logistic Regression Model
lrm(formula = exam ~ att, weights = w)
Sum of Weights by Response Category
0 1 2
1782 3916 1479
                      Model Likelihood     Discrimination    Rank Discrim.
                            Ratio Test            Indexes          Indexes
Obs               6   LR chi2    2695.99    R2       0.362    C       0.738
 0                2   d.f.             1    g        1.755    Dxy     0.476
 1                2   Pr(> chi2) <0.0001    gr       5.784    gamma   0.879
 2                2                         gp       0.299    tau-a   0.285
Sum of weights 7177                         Brier    0.129
max |deriv| 2e-13
Coef S.E. Wald Z Pr(>|Z|)
y>=1 -0.2566 0.0390 -6.58 <0.0001
y>=2 -3.7157 0.0667 -55.70 <0.0001
att 2.9253 0.0663 44.10 <0.0001
This gives the same Wald and LRT statistics that I calculated before.
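If you want the effect reported directly on the odds ratio scale with a confidence interval, the usual rms workflow is to register a datadist first (the data frame name d here is just illustrative):

library(rms)
d  <- data.frame(att = att, exam = exam, w = w)
dd <- datadist(d); options(datadist = "dd")
m  <- lrm(exam ~ att, data = d, weights = w)
summary(m)   # reports the att effect on the odds ratio scale with a 95% CI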
Best Answer
I would interpret the contact coefficient as:
My problem with the authors' interpretation is twofold:
Having said that, when I've written papers employing cumulative link models, I've usually had to provide model predictions to make the results completely understandable to the lay audience. Sometimes the correct answer is to treat an ordinal outcome like a continuous response for the sake of easily understanding the output.