Model Selection – How to Choose Model Performance Metrics for Ordinal Response

Tags: model-selection, ordinal-data, predictive-models, r

I'm interested in assessing model performance on data with an ordinal categorical dependent variable. For my use case, the ideal metric would:

  1. Not assume equal intervals between classes or that recoding to a continuous scale is appropriate
  2. Be scale independent
  3. Give preference to models that rank the outcomes accurately, with higher penalties for mis-ranking classes with a larger degree of difference (e.g., Excellent > Poor > Good is better than Excellent > Very Poor > Good)
  4. Accept continuous predictions and be indifferent to their distributions

For example, suppose we have the following test set, where "response" is a 5-category ordinal response and "pred1", "pred2", and "pred3" are three sets of predictions:

id      response   pred1    pred2    pred3
 1     Excellent    1.00      150       10
 2          Good     .80       39        9
 3          Good     .85       12        5
 4          Fair     .40       11        4
 5          Poor     .39       10        3
 6     Very Poor     .20        3        2
 .             .       .        .        .
 .             .       .        .        .

For my purposes, the ideal metric would score all three predictions as equally accurate since all three perfectly rank the response.

What are my options and the benefits/drawbacks to each? Bonus points for references to R packages or functions.

Best Answer

A good measure is Somers' Dxy rank correlation, a generalization of the ROC area to ordinal or continuous Y (Dxy = 2 × (c − 0.5), where c is the concordance probability). It is computed for ordinal proportional-odds regression by the lrm function in the rms package.
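To see why this fits the requirements, here is a minimal sketch (in Python rather than R, purely to show the arithmetic) of Somers' Dxy computed directly from concordant and discordant pairs. The integer coding of the response (Very Poor = 1, ..., Excellent = 5) is an assumption for illustration only; since only pairs with unequal response levels are compared, the spacing of the codes never enters the calculation.

```python
def somers_dxy(response, pred):
    """Somers' Dxy of continuous predictions against an ordinal response.

    `response` is any integer coding that respects the ordering;
    pairs tied on the response are ignored, so the spacing between
    codes is irrelevant (requirement 1), and only the sign of the
    prediction differences matters (requirements 2 and 4).
    """
    concordant = discordant = tied_pred = 0
    n = len(response)
    for i in range(n):
        for j in range(i + 1, n):
            if response[i] == response[j]:
                continue  # tied response levels carry no rank information
            if pred[i] == pred[j]:
                tied_pred += 1  # prediction ties count against Dxy
            elif (response[i] > response[j]) == (pred[i] > pred[j]):
                concordant += 1
            else:
                discordant += 1
    return (concordant - discordant) / (concordant + discordant + tied_pred)

# Assumed ordinal coding of the question's example rows:
# Excellent = 5, Good = 4, Fair = 3, Poor = 2, Very Poor = 1
response = [5, 4, 4, 3, 2, 1]
pred1 = [1.00, 0.80, 0.85, 0.40, 0.39, 0.20]
pred2 = [150, 39, 12, 11, 10, 3]
pred3 = [10, 9, 5, 4, 3, 2]

for name, pred in [("pred1", pred1), ("pred2", pred2), ("pred3", pred3)]:
    print(name, somers_dxy(response, pred))  # all three print 1.0
```

All three prediction columns rank every unequal pair of responses correctly, so each scores Dxy = 1 despite their very different scales and distributions, which is exactly the behavior the question asks for. In R, the Hmisc package (a companion to rms) can report Dxy directly from predictions and outcomes via rcorr.cens, without fitting a model first.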
