ROC Curve Interpretation – How to Understand Its Significance

classification, logistic regression, roc

I applied logistic regression to my data in SAS, and here are the ROC curve and classification table.

[Image: ROC curve and classification table]

I am comfortable with the figures in the classification table, but I am not sure what the ROC curve and the area under it show. Any explanation would be greatly appreciated.

Best Answer

In logistic regression you have two classes coded as $1$ and $0$. The model computes, for each individual, the probability of belonging to the class coded as $1$ given some explanatory variables. If you now choose a probability threshold and classify all individuals with a probability greater than this threshold as class $1$ and all others as $0$, you will in most cases make some errors, because the two groups usually cannot be discriminated perfectly. For this threshold you can compute your errors and the so-called sensitivity (the proportion of true $1$s classified as $1$) and specificity (the proportion of true $0$s classified as $0$). If you do this for many thresholds, you can construct a ROC curve by plotting sensitivity against 1 - specificity, one point per threshold.

The area under the curve comes into play if you want to compare different methods that try to discriminate between two classes, e.g. discriminant analysis or a probit model. An area of 0.5 corresponds to random guessing and an area of 1 to perfect discrimination. You can construct the ROC curve for each of these models, and the one with the highest area under the curve can be seen as the best model at discriminating between the two classes.
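To make the threshold sweep concrete, here is a minimal sketch in plain Python (not SAS, and not what SAS does internally). The function names `roc_points` and `auc`, the toy labels, and the two sets of predicted probabilities are all made up for illustration; the area is approximated with the trapezoidal rule.

```python
def roc_points(y_true, scores):
    """Sweep a threshold over the predicted probabilities and return
    (fpr, tpr), i.e. (1 - specificity, sensitivity) for each threshold."""
    pos = sum(y_true)             # number of individuals in class 1
    neg = len(y_true) - pos       # number of individuals in class 0
    # Use every distinct score as a threshold, from high to low, plus -inf
    # so that the last point classifies everyone as 1 and the curve ends at (1, 1).
    thresholds = sorted(set(scores), reverse=True) + [float("-inf")]
    fpr, tpr = [], []
    for t in thresholds:
        # Classify as 1 when the probability is greater than the threshold.
        tp = sum(1 for y, s in zip(y_true, scores) if s > t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s > t and y == 0)
        tpr.append(tp / pos)      # sensitivity
        fpr.append(fp / neg)      # 1 - specificity
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve via the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


# Toy data (hypothetical): true classes and predicted probabilities
# from two imaginary models fitted to the same four individuals.
y_true = [0, 0, 1, 1]
scores_a = [0.1, 0.4, 0.35, 0.8]   # one class-0 case outranks a class-1 case
scores_b = [0.2, 0.3, 0.6, 0.9]    # separates the classes perfectly

auc_a = auc(*roc_points(y_true, scores_a))
auc_b = auc(*roc_points(y_true, scores_b))
print(auc_a, auc_b)  # 0.75 1.0
```

In this toy comparison the second model has the larger area, so by the criterion described above it would be preferred. With real data you would normally rely on your statistics package's own ROC output rather than hand-rolled code like this.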

If you need a deeper understanding, you can also read the answer to a different question regarding ROC curves by clicking here.