Solved – How to interpret the area under the precision-recall curve


I recently learned of precision-recall curves as an alternative to ROC curves for testing the performance of a binary classification model. However, I don't know how to interpret the meaning of the area under a precision-recall curve. Is it the "average precision"? What does that mean?

Best Answer

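Yes, it is average precision. The average is taken over the thresholds for saying "yes": as the threshold is lowered, recall rises from 0 to 1, and the area is the precision averaged over that range of recall.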
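To make "average precision" concrete, here is a minimal sketch in plain Python. It uses the step-wise sum of precision weighted by each increase in recall, which is one common definition (trapezoidal integration of the curve gives a slightly different number). The function name, the toy labels, and the toy scores are made up for illustration, and the sketch assumes no tied scores.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    labels: 0/1 ground truth; scores: higher means more confident "yes".
    Assumes at least one positive and no tied scores (illustrative only).
    """
    # Rank cases from most to least confident; each rank k is one threshold.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    true_pos = 0
    prev_recall = 0.0
    ap = 0.0
    for k, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        recall = true_pos / total_pos
        precision = true_pos / k
        # Precision only contributes where recall actually increases.
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


# Hypothetical toy data: three positives among six cases.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
ap = average_precision(labels, scores)
print(round(ap, 4))  # 1/3 * (1 + 2/3 + 3/4) = 29/36
```

Each positive contributes the precision at the moment it is recovered, so the result is the mean of those precisions: (1 + 2/3 + 3/4) / 3.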

The precision-recall curve typically starts out relatively high and descends, though not monotonically. On the right edge, you can guarantee perfect recall by saying "yes" to everything, so precision falls to the base rate (the fraction of cases that are actually positive). On the left edge, you require near-absolute certainty to say "yes", so you miss a lot, but hopefully everything you do identify is a target.

Because of noise, the curve fluctuates rather than falling smoothly.
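The shape described above can be checked on a small example. The sketch below, with a hypothetical `pr_curve` helper and made-up data, lists the (recall, precision) point at each threshold and confirms three things: precision starts high, it does not fall monotonically, and it ends at the base rate once everything is labelled "yes". It assumes no tied scores.

```python
def pr_curve(labels, scores):
    """Return (recall, precision) pairs from the strictest to the loosest threshold."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    true_pos = 0
    curve = []
    for k, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        curve.append((true_pos / total_pos, true_pos / k))
    return curve


# Hypothetical toy data: three positives among six cases.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

curve = pr_curve(labels, scores)
precisions = [p for _, p in curve]
base_rate = sum(labels) / len(labels)

# True only if precision never rises as the threshold is lowered.
is_monotone = all(a >= b for a, b in zip(precisions, precisions[1:]))

for recall, precision in curve:
    print(f"recall={recall:.2f}  precision={precision:.2f}")
print("base rate:", base_rate, "monotone:", is_monotone)
```

Here the second-ranked case is a false positive, so precision dips to 0.5 and then recovers as further true positives are found, which is the kind of fluctuation noise produces.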

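If the base rate is low, it's possible for a model to have a high area under the ROC curve but still a low area under the PR curve. The false positive rate is computed over the large pool of negatives, so even a sizeable number of false positives barely moves the ROC curve, while the same false positives directly drag down precision. For example, Andy Berger notes this is the case for conflict studies, and provides some example graphs.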
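A small synthetic example shows how the two areas can diverge when positives are rare. Everything here is invented for illustration: 1,000 negatives with evenly spread integer scores, and 10 positives that each outrank roughly 90% or more of the negatives. The helper names are hypothetical, and ROC AUC is computed by brute force as the fraction of (positive, negative) pairs ranked correctly.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision at each rank where a positive is found (no ties assumed)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    true_pos = 0
    ap = 0.0
    for k, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            ap += (true_pos / k) / total_pos
    return ap


# Synthetic, made-up data with a base rate of about 1%.
neg_scores = [10 * i for i in range(1000)]        # 0, 10, ..., 9990
pos_scores = [9005 + 100 * j for j in range(10)]  # 9005, 9105, ..., 9905
labels = [0] * len(neg_scores) + [1] * len(pos_scores)
scores = neg_scores + pos_scores

base_rate = sum(labels) / len(labels)
auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"base rate={base_rate:.4f}  ROC AUC={auc:.3f}  average precision={ap:.3f}")
```

In this toy setup the ROC AUC is about 0.95, yet average precision stays below 0.1: the handful of negatives that outrank each positive is tiny relative to 1,000 negatives, but large relative to 10 positives. Average precision is still well above the base rate of about 0.01, which is the level a model with no skill would reach.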