Solved – How to generate ROC curves for leave-one-out cross validation

cross-validation, roc

When performing 5-fold cross-validation (for example), it is typical to compute a separate ROC curve for each of the 5 folds, and often a mean ROC curve with the standard deviation shown as curve thickness.

However, for LOO cross-validation, where each fold contains only a single test data point, it doesn't seem sensible to compute an ROC "curve" from that single point.

I have been taking all of my test data points (along with their separately computed p-values) and pooling them into one large set to compute a single ROC curve, but is this the statistically kosher thing to do?

What is the right way to apply ROC analysis when each fold contains only a single data point (as in LOO cross-validation)?
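For concreteness, here is roughly what I have been doing, as a minimal sketch with scikit-learn (the synthetic data and logistic-regression classifier are just placeholders for my actual setup):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import LeaveOneOut

X, y = make_classification(n_samples=100, random_state=0)

loo = LeaveOneOut()
scores = np.empty(len(y))  # one predicted probability per held-out point
for train_idx, test_idx in loo.split(X):
    clf = LogisticRegression().fit(X[train_idx], y[train_idx])
    # probability of the positive class for the single held-out point
    scores[test_idx] = clf.predict_proba(X[test_idx])[:, 1]

# pool all held-out scores into one large set and compute a single ROC curve
fpr, tpr, _ = roc_curve(y, scores)
print("pooled LOO AUC:", auc(fpr, tpr))
```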

Best Answer

If the classifier outputs probabilities, then pooling all of the test-point outputs into a single ROC curve is appropriate. If not, scale the output of the classifier so that it is directly comparable across the per-fold classifiers. For example, say you are using Linear Discriminant Analysis (LDA). Train the classifier, then pass the training data back through it and learn two parameters: a scale parameter $\sigma$ (the standard deviation of the classifier outputs after subtracting the class means) and a shift parameter $\mu$ (the mean output of the first class). Use these parameters to normalize the raw output $r$ of each fold's LDA classifier via $n = (r-\mu)/\sigma$, and then construct an ROC curve from the set of normalized outputs. The caveat is that you are estimating additional parameters, so the results may deviate slightly from an ROC curve constructed on a separate test set.
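A minimal sketch of that normalization, assuming scikit-learn's LinearDiscriminantAnalysis with its decision_function as the raw output $r$ (the data are again placeholders):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import LeaveOneOut

X, y = make_classification(n_samples=100, random_state=0)

loo = LeaveOneOut()
normalized = np.empty(len(y))
for train_idx, test_idx in loo.split(X):
    lda = LinearDiscriminantAnalysis().fit(X[train_idx], y[train_idx])
    r_train = lda.decision_function(X[train_idx])  # raw outputs on training data

    # shift parameter mu: mean output of the first class
    mu = r_train[y[train_idx] == 0].mean()

    # scale parameter sigma: std. dev. of the outputs after
    # subtracting each class's mean
    centered = r_train.copy()
    for c in np.unique(y[train_idx]):
        centered[y[train_idx] == c] -= r_train[y[train_idx] == c].mean()
    sigma = centered.std()

    # normalize the held-out point's raw output: n = (r - mu) / sigma
    normalized[test_idx] = (lda.decision_function(X[test_idx]) - mu) / sigma

# pooled ROC curve over the normalized per-fold outputs
fpr, tpr, _ = roc_curve(y, normalized)
print("normalized LOO AUC:", auc(fpr, tpr))
```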

If it is not possible to normalize the classifier outputs or transform them into probabilities, then an ROC analysis based on LOO-CV is not appropriate.