Solved – Average multiple precision recall curves


I have a binary classifier that I evaluate using cross-validation. The classifier gives me a score for each observation, and for each of the test sets I vary a threshold to compute precision and recall values, similar to what is described in Threshold in precision/recall curve.

I can plot the precision and recall values for each test set as a curve by connecting the points with lines, but how would I average the different precision-recall curves into one?

My first idea would be to use a for-loop over all recall values and (sub)sample precision values at each one, but that seems very inelegant.

How do you do it the right way?

Best Answer

I ended up taking an 'engineering' approach and used an optimized interpolation function, in my case numpy.interp:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.interp.html
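The answer does not show how the interpolation was applied, so the following is only a sketch of one plausible reading: resample each fold's precision onto a shared recall grid with numpy.interp, then average point-wise. The function name `average_pr_curves`, the `grid_size` parameter and the two toy folds are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def average_pr_curves(curves, grid_size=101):
    """Average precision-recall curves on a shared recall grid.

    `curves` is a list of (recall, precision) array pairs, one per test fold.
    Hypothetical helper: the name and signature are illustrative only.
    """
    recall_grid = np.linspace(0.0, 1.0, grid_size)
    interpolated = []
    for recall, precision in curves:
        recall = np.asarray(recall, dtype=float)
        precision = np.asarray(precision, dtype=float)
        # np.interp expects increasing x-coordinates, so sort by recall first.
        order = np.argsort(recall, kind="stable")
        interpolated.append(
            np.interp(recall_grid, recall[order], precision[order])
        )
    # Point-wise mean over folds at every grid recall value.
    return recall_grid, np.mean(interpolated, axis=0)

# Two made-up folds, each given as (recall, precision).
fold_a = (np.array([0.0, 0.5, 1.0]), np.array([1.0, 0.8, 0.5]))
fold_b = (np.array([0.0, 0.25, 1.0]), np.array([1.0, 0.9, 0.6]))
recall_grid, mean_precision = average_pr_curves([fold_a, fold_b], grid_size=5)
print(recall_grid)
print(mean_precision)
```

Two caveats: numpy.interp holds the end values constant outside the range of recall values a fold actually covers, and linear interpolation between precision-recall points is a simplification, so the averaged curve is best read as a presentation aid. Folds that contain several points with the same recall may also need deduplicating before interpolation.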

Related Solutions

Solved – Precision recall curve for nearest neighbor classifier

The precision/recall curve for a KNN classifier effectively consists of two points (since KNN predicts binary values), so such a curve is not very useful or meaningful. One could instead use the fraction of a given class in the neighborhood (i.e. a non-smoothed density estimate; requires K > 1) as the model prediction, which would make the precision/recall curve more meaningful.
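As a rough illustration of the "fraction of a given class in the neighborhood" idea, here is a NumPy-only sketch. The function name `knn_class_fraction`, the brute-force Euclidean distance and the toy data are assumptions made for the example.

```python
import numpy as np

def knn_class_fraction(train_X, train_y, query_X, k=5, positive_label=1):
    """Fraction of the k nearest training points that carry the positive label.

    Hypothetical helper using brute-force Euclidean distances.
    """
    train_X = np.asarray(train_X, dtype=float)
    query_X = np.asarray(query_X, dtype=float)
    is_positive = (np.asarray(train_y) == positive_label).astype(float)
    # Pairwise distances, shape (n_query, n_train).
    dists = np.linalg.norm(query_X[:, None, :] - train_X[None, :, :], axis=2)
    # Indices of the k closest training points for each query point.
    nearest = np.argsort(dists, axis=1)[:, :k]
    return is_positive[nearest].mean(axis=1)

# Made-up one-dimensional data.
train_X = [[0.0], [1.0], [2.0], [10.0], [11.0]]
train_y = [0, 0, 1, 1, 1]
scores = knn_class_fraction(train_X, train_y, [[1.4], [10.5]], k=3)
print(scores)
```

With K neighbors the score can take at most K + 1 distinct values, so the resulting curve has at most that many distinct working points; a larger K gives a finer curve.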

As for leveraging distance, one could try "distance-weighted voting", using a distance-based kernel to get the weights (one of the simplest kernel density methods), but that would not really qualify as a KNN classifier, since KNN's kernel is just a step function.
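Distance-weighted voting could look roughly like the sketch below. The Gaussian kernel, the `bandwidth` parameter, the restriction to the k nearest points and the name `weighted_knn_score` are all assumptions chosen for illustration; the text above does not prescribe a particular kernel.

```python
import numpy as np

def weighted_knn_score(train_X, train_y, query_X, k=5, bandwidth=1.0,
                       positive_label=1):
    """Kernel-weighted share of positive labels among the k nearest points.

    Hypothetical helper; the Gaussian kernel is an assumed choice.
    """
    train_X = np.asarray(train_X, dtype=float)
    query_X = np.asarray(query_X, dtype=float)
    is_positive = (np.asarray(train_y) == positive_label).astype(float)
    dists = np.linalg.norm(query_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    nearest_dists = np.take_along_axis(dists, nearest, axis=1)
    # Closer neighbors get larger weights.
    weights = np.exp(-0.5 * (nearest_dists / bandwidth) ** 2)
    return (weights * is_positive[nearest]).sum(axis=1) / weights.sum(axis=1)

# Made-up one-dimensional data.
train_X = [[0.0], [1.0], [2.0], [10.0], [11.0]]
train_y = [0, 0, 1, 1, 1]
weighted = weighted_knn_score(train_X, train_y, [[1.4]], k=3, bandwidth=1.0)
print(weighted)
```

If all k neighbors are very far away relative to `bandwidth`, the weights can underflow to zero and the division becomes undefined, so a real implementation would need to guard against that case.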

Solved – Threshold in precision/recall curve

Short answer: Torgo describes the usual method of generating such curves.

You can choose your threshold (= cut-off limit in the cited text) at any value. The cited text refers to one such choice as a working point.
That is, for a given working point, you'll observe exactly one (precision; recall) pair, i.e. one point in your graph. The precision-recall-curve is obtained by varying the threshold over the whole range of the classifier's continuous output ("scores", posterior probabilities, "votes") thus generating a curve from many working points.

Edit with respect to the comment:

I think "varying the threshold" is the usual way to explain or define the curve.

For the calculation, it is more efficient to sort the scores, and then see how precision and recall change when adding the next case: precision and recall can only change when the change in the threshold is large enough to cover the next score.

Consider this example:

case   true class   predicted score (high => class B)
1      A            0.2
3      B            0.5
2      A            0.6
4      B            0.9

threshold      recall    precision
> 0.9          0.0       N/A
(0.6, 0.9]     0.5       1.0
(0.5, 0.6]     0.5       0.5
(0.2, 0.5]     1.0       0.67
<= 0.2         1.0       0.5

That is, the precision-recall curve actually consists of points. It jumps from one point to the next when the threshold "crosses" an actually predicted score. A smooth curve will result only for large numbers of test cases.
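The sort-based calculation described above can be sketched as follows, using the four cases from the example with class B as the positive class. The function name `precision_recall_points` is illustrative, and the sketch assumes a case is predicted positive when its score is at or above the threshold and that at least one positive case exists.

```python
import numpy as np

def precision_recall_points(y_true, scores):
    """Precision and recall at every distinct score used as a threshold.

    Hypothetical helper; a case counts as predicted positive when
    score >= threshold.
    """
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    # Sort cases from highest to lowest score.
    order = np.argsort(-scores, kind="stable")
    sorted_scores = scores[order]
    # Lowering the threshold past each case adds it to the predicted positives.
    true_positives = np.cumsum(y_true[order])
    predicted_positives = np.arange(1, len(scores) + 1)
    # Keep only the last case of each run of tied scores.
    last_of_run = np.append(sorted_scores[1:] != sorted_scores[:-1], True)
    thresholds = sorted_scores[last_of_run]
    precision = true_positives[last_of_run] / predicted_positives[last_of_run]
    recall = true_positives[last_of_run] / y_true.sum()
    return thresholds, precision, recall

# Cases 1, 3, 2, 4 from the example; True marks class B.
y_true = [False, True, False, True]
scores = [0.2, 0.5, 0.6, 0.9]
thresholds, precision, recall = precision_recall_points(y_true, scores)
for t, p, r in zip(thresholds, precision, recall):
    print(f"threshold {t:.1f}: recall {r:.2f}, precision {p:.2f}")
```

Each printed line corresponds to one working point. The case where the threshold lies above every score is not produced, because no case is predicted positive there and precision is undefined.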
