Solved – Area between the ROC curve and the Random Guessing Line

auc, classification, machine learning, roc

How close is my classifier to random guessing?

I need a single-number evaluation metric that quantifies the inability of a binary classifier to obtain better results than random guessing.

The random guessing line (RGL) from (0,0) to (1,1) has an AUC of 0.5. But so does the blue curve (grey area).

Wouldn't it be more suitable to use the area between the RGL and the ROC-curve to estimate how "close" a classifier is to actual random guessing?

Best Answer

First, the 0.5 random guess line is just a visual reference; what we really want to know is how well the classifier performs overall. I'm also not sure how your proposed method would provide a different result, assuming that you subtract the first section, which lies below the 0.5 line.
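To make the two possible readings of "area between the curves" concrete, here is a minimal MATLAB sketch. The ROC points are a hypothetical step-shaped curve chosen to have an AUC of 0.5; they are not the blue curve from the question, whose coordinates are not given. If the part below the diagonal is subtracted, the result is just AUC minus 0.5; if every deviation is counted as positive, the result is a different number.

```matlab
% Hypothetical ROC curve: below the diagonal first, above it afterwards.
fpr = [0 0.5 0.5 1];
tpr = [0 0   1   1];

auc          = trapz(fpr, tpr);              % area under the ROC curve
signedArea   = trapz(fpr, tpr - fpr);        % parts below the diagonal count negative
unsignedArea = trapz(fpr, abs(tpr - fpr));   % every deviation counts positive

fprintf('AUC = %.3f, signed = %.3f, unsigned = %.3f\n', ...
        auc, signedArea, unsignedArea);
```

For this curve the AUC is 0.5, the signed area is 0, and the unsigned area is 0.25. The trapezoid rule on abs(tpr - fpr) is exact here only because the curve crosses the diagonal on a vertical segment; for a curve that crosses in the middle of a segment it would be an approximation.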

Second, the AUC has a nice statistical property: it is equivalent to the Wilcoxon-Mann-Whitney U statistic, normalized by the number of positive-negative pairs. In other words, it is the probability that the classifier will rank a randomly chosen positive instance higher than a randomly chosen negative instance.
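The ranking interpretation can be checked numerically. The following is a minimal sketch on hypothetical toy data (labels and scores invented for illustration), using only base MATLAB functions. It computes the AUC once from all positive-negative pairs and once as the trapezoidal area under the ROC curve.

```matlab
% Hypothetical toy data: 1 = positive, 0 = negative. No tied scores.
label = [1   1   1   0   0   0   0];
score = [0.9 0.6 0.3 0.7 0.4 0.2 0.1];

pos = score(label == 1);
neg = score(label == 0);

% Compare every positive with every negative (bsxfun also works on old releases).
wins = bsxfun(@gt, pos(:), neg(:).');   % positive scored above negative
ties = bsxfun(@eq, pos(:), neg(:).');   % equal scores count as half a win

U       = sum(wins(:)) + 0.5 * sum(ties(:));   % Mann-Whitney U statistic
aucRank = U / (numel(pos) * numel(neg));       % fraction of correctly ordered pairs

% ROC curve by sweeping the threshold from the highest score downwards.
[~, order]  = sort(score, 'descend');
sortedLabel = label(order);
tpr = [0, cumsum(double(sortedLabel == 1)) / numel(pos)];
fpr = [0, cumsum(double(sortedLabel == 0)) / numel(neg)];
aucTrapz = trapz(fpr, tpr);

fprintf('AUC from ranks = %.4f, AUC from ROC = %.4f\n', aucRank, aucTrapz);
```

Both values come out as 0.75 for this data: 9 of the 12 positive-negative pairs are ordered correctly. The threshold sweep assumes there are no tied scores; with ties, tied objects would have to be grouped into a single ROC step.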

In this case, we don't need to see the AUC value to know this is a poor classifier (and one unlikely to be seen in real life, as it implies that true positive values are consistently rated under a certain probability threshold while false positive values are more evenly distributed).

Related Solutions

Solved – pattern of ROC curve and choice of AUC

I agree with your concerns.

Given that people in reality will seldom choose an FPR cut-off of 0.5 or higher, why do people prefer a ROC curve with FPR ranging from 0 to 1 and use the full AUC value (i.e. calculate the entire area under the ROC curve) instead of just reporting the area from, say, 0 to 0.25 or to 0.5? Is that called "partial AUC"?

I'm a big fan of having the complete ROC, as it gives much more information than just the sensitivity/specificity pair of one working point of a classifier.
For the same reason, I'm not a big fan of summarizing all that information even further into one single number. But if you have to do so, I agree that it is better to restrict the calculations to parts of the ROC that are relevant for the application.
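As an illustration of restricting the calculation to the relevant part of the ROC, here is a minimal sketch of a partial AUC in base MATLAB. The ROC points and the cut-off fprMax are hypothetical and chosen only for illustration.

```matlab
% Hypothetical piecewise-linear ROC curve with strictly increasing FPR values.
fpr = [0 0.1 0.25 0.5  1];
tpr = [0 0.5 0.7  0.85 1];
fprMax = 0.25;                        % assumed upper FPR limit of interest

% Interpolate at the cut-off so the region ends exactly at fprMax.
% interp1 needs distinct FPR values; vertical ROC steps would have to be merged first.
tprAtMax = interp1(fpr, tpr, fprMax);
keep     = fpr < fprMax;
fprPart  = [fpr(keep), fprMax];
tprPart  = [tpr(keep), tprAtMax];

fullAUC       = trapz(fpr, tpr);          % area for 0 <= FPR <= 1
partialAUC    = trapz(fprPart, tprPart);  % area for 0 <= FPR <= fprMax
scaledPartial = partialAUC / fprMax;      % partial area relative to its maximum

fprintf('full AUC = %.4f, partial AUC = %.4f, scaled = %.4f\n', ...
        fullAUC, partialAUC, scaledPartial);
```

For these points the full AUC is 0.77125 and the partial area up to an FPR of 0.25 is 0.115. Since the partial area can be at most fprMax, dividing by fprMax (0.46 here) is one simple way to put it back on a 0 to 1 scale.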

In the figure below, what can we say about the performances of the three models? The AUC values are: green (0.805), red (0.815), blue (0.768). The red curve turns out to be superior, but as you see, the superiority only shows for FPR > 0.2. Thanks :)

That depends entirely on your application. In your example, if high specificity is needed, then the green classifier would be best. If high sensitivity is needed, go for the red one.

As to the comparison of classifiers: there are lots of questions and answers here discussing this. Summary:

classifier comparison is far more difficult than one would expect at first
not all classifier performance measures are good for this task. Read @FrankHarrell's answers, and go for so-called proper scoring rules (e.g. the Brier score/mean squared error).

Solved – outlier detection: area under precision recall curve

The problem with your example is that it is possible to have zero $tp$ and zero $fp$, in which case the precision $prec = tp/(tp+fp)$ is undefined because we divide by zero. Because of this, the PR curve only contains points for one $x$-value, and therefore the area under the PR curve becomes zero for your example.
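The division by zero is easy to reproduce by hand. The sketch below uses hypothetical labels and identical scores (not the data from the question) and a threshold at which nothing is predicted as positive.

```matlab
% Hypothetical data: one outlier, and every object receives the same score.
label = [1 0 0 0 0];
score = 0.5 * ones(1, 5);

threshold = 0.5;                  % predict 1 only if the score exceeds the threshold
pred = score > threshold;         % nothing is predicted as positive

tp = sum(pred & (label == 1));    % true positives  -> 0
fp = sum(pred & (label == 0));    % false positives -> 0
prec = tp / (tp + fp);            % 0/0, which MATLAB evaluates to NaN

fprintf('tp = %d, fp = %d, precision = %g\n', tp, fp, prec);
```

With no positive predictions both counts are zero, so the precision evaluates to NaN rather than to a usable point on the curve.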

You can see this by plotting the PR curve:

% label and score are the vectors from the question (all scores identical)
[X,Y,T,PR] = perfcurve(label,score,1, 'xCrit', 'reca', 'yCrit', 'prec') % PR = 0
figure
scatter(X,Y)
xlabel('recall')
ylabel('precision')

So plotting a PR curve doesn't really work well when all your scores are the same.

To gain more insight into the difference between the PR curve and the ROC curve, compare these two prediction lists. In the first (score1), we predict 0 everywhere except for one object, where we predict 1 although it should be 0. In the second (score2), we predict one 1 correctly and classify everything else as 0. We then compare the area under the PR curve and the area under the ROC curve for both.

outlier = 1;
normal  = 0;
% 99% normal data 1% outlier
label = normal*ones(1000,1); 
label(1:10) = outlier;  

%label = real( rand(1000,1) > 0.99 );     % 99% normal data 1% outlier
score1 = [zeros(999,1);1]; % predict everything as zero, and one mistake 
score2 = [1;zeros(999,1)]; % predict everything as zero, and one 1 correct 

[X,Y,T,AUC1] = perfcurve(label,score1,1)
% AUC1 = 0.4995 (about 0.5)
[X,Y,T,AUC2] = perfcurve(label,score2,1)
% AUC2 = 0.55

[X,Y,T,PR1] = perfcurve(label,score1,1, 'xCrit', 'reca', 'yCrit', 'prec') 
% PR1 = 0.005 
[X,Y,T,PR2] = perfcurve(label,score2,1, 'xCrit', 'reca', 'yCrit', 'prec') 
% PR2 = 0.4545

Observe that the AUC varies little between score1 and score2. However, the area under the PR curve is significantly different. It rewards score2 much more than score1. This indicates it is better suited to outlier detection: it rewards detecting the outlier much more than the AUC. In case of outlier detection you would prefer score2 much more, since it predicts the 1 that you want to detect correctly, while score1 predicts a 1 for a zero and never catches any outliers.
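To see where the difference comes from, the following sketch counts the confusion-matrix entries for both score lists at a single threshold of 0.5 (a threshold chosen here for illustration; perfcurve sweeps over all thresholds). It uses only base MATLAB.

```matlab
% Same setup as above: 10 outliers (1) among 1000 objects.
label = zeros(1000, 1);
label(1:10) = 1;
scores = {[zeros(999, 1); 1], [1; zeros(999, 1)]};   % score1, score2

prec = zeros(1, 2); reca = zeros(1, 2); fpr = zeros(1, 2);
for k = 1:2
    pred = scores{k} > 0.5;                 % objects flagged as outliers
    tp = sum( pred & (label == 1));
    fp = sum( pred & (label == 0));
    fn = sum(~pred & (label == 1));
    tn = sum(~pred & (label == 0));
    prec(k) = tp / (tp + fp);               % precision
    reca(k) = tp / (tp + fn);               % recall (true positive rate)
    fpr(k)  = fp / (fp + tn);               % false positive rate
    fprintf('score%d: precision = %.3f, recall = %.3f, FPR = %.4f\n', ...
            k, prec(k), reca(k), fpr(k));
end
```

At this threshold the false positive rate moves only from about 0.001 to 0, while the precision jumps from 0 to 1. The ROC coordinates barely register the difference between the two lists; the precision captures it fully.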

In general, the ROC AUC is more informative about how well your predictions work across varying priors, i.e. for varying proportions of ones and zeros.

The PR curve is a better indicator of how the classifier performs for the class imbalance actually under consideration. It is therefore more interesting for you: it takes into account that there are far fewer 1's than 0's in your dataset. Since this is the only case you care about in outlier detection, the PR curve is more informative.

The AUC, by contrast, also characterizes how your predictions would do if there were many more 1's.
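The dependence on the class prior can be made explicit. For a fixed operating point on the ROC curve, the precision follows from Bayes' rule as tpr*prior / (tpr*prior + fpr*(1 - prior)). The sketch below evaluates this for a hypothetical operating point and a few assumed priors.

```matlab
% Hypothetical operating point on a ROC curve; it does not depend on the prior.
tpr = 0.8;     % true positive rate (recall)
fpr = 0.05;    % false positive rate

prior = [0.5 0.1 0.01];    % assumed fractions of 1's in the data

% Precision at the same operating point for each prior (Bayes' rule).
prec = tpr .* prior ./ (tpr .* prior + fpr .* (1 - prior));

for k = 1:numel(prior)
    fprintf('prior = %.2f -> precision = %.3f\n', prior(k), prec(k));
end
```

The same ROC point gives a precision of roughly 0.94 for balanced classes, 0.64 with 10% ones, and 0.14 with 1% ones. This is why the PR curve reflects the imbalance of the dataset at hand while the ROC curve does not.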

For more information see also:

https://www.quora.com/What-is-Precision-Recall-PR-curve

ROC vs precision-and-recall curves

Finally, you might be interested in how to compute an ROC or PR curve; a detailed explanation for ROC curves is given here:

http://blogs.sas.com/content/iml/2011/07/29/computing-an-roc-curve-from-basic-principles.html
