Precision-Recall Curve – How to Form a Precision-Recall Curve with Single P-R Value?

computer vision, curves, data mining, MATLAB, precision-recall

I have a data mining assignment in which I am building a content-based image retrieval system.
I have 20 images of each of 5 animals, so 100 images in total.

My system returns the 10 most relevant images for an input image. Now I need to evaluate the performance of my system with a Precision-Recall curve. However, I do not understand the concept of a Precision-Recall curve. Let's say my system returns 10 images for a gorilla image, but only 4 of them are gorillas. The other 6 returned images show other animals. Thus,

  • precision = (relevant returned) / (all returned) = 4/10 = 0.4
  • recall = (relevant returned) / (all relevant) = 4/20 = 0.2

So I only have a single point, <0.2,0.4>, not a curve. How do I get a curve (i.e., a set of points)? Should I change the number of images returned (this is fixed at 10 in my case)?

Best Answer

Generating a PR curve is similar to generating an ROC curve. To draw such plots you need a full ranking of the test set. To make this ranking, you need a classifier which outputs a decision value rather than a binary answer. The decision value is a measure of confidence in a prediction which we can use to rank all test instances. As an example, the decision values of logistic regression and SVM are a probability and a (signed) distance to the separating hyperplane, respectively.

Once you have decision values, you define a set of thresholds on them. Each threshold corresponds to a different setting of the classifier: for example, raising the threshold makes the classifier more conservative about predicting the positive class. For logistic regression, the default threshold would be $f(\mathbf{x}) = 0.5$, but you can sweep over the entire range $(0, 1)$. Typically, the thresholds are chosen to be the unique decision values your model yielded for the test set.

At each choice of threshold, your model yields different predictions (e.g. a different number of positive and negative predictions). As such, you get a different precision and recall at every threshold, i.e. a set of tuples $( T_i, P_i, R_i )$. The PR curve is drawn from the $( P_i, R_i )$ pairs.
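The threshold sweep described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `pr_points`, the toy `scores` and `labels`, and the rule "predict positive when the score is at least the threshold" are all made up for the example and are not taken from the question.

```python
def pr_points(scores, labels):
    """Return (threshold, precision, recall) tuples, one per unique score.

    Assumptions for this sketch: higher score means more confident positive,
    labels are 1 (positive) or 0 (negative), and at least one label is 1.
    """
    total_pos = sum(labels)
    points = []
    # Use every unique decision value as a threshold, strictest first.
    for t in sorted(set(scores), reverse=True):
        predicted = [s >= t for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y)
        n_pred = sum(predicted)  # never 0, because t is one of the scores
        points.append((t, tp / n_pred, tp / total_pos))
    return points


# Toy decision values and ground-truth labels (hypothetical data).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]

points = pr_points(scores, labels)
for t, p, r in points:
    print(f"T={t:.1f}  P={p:.2f}  R={r:.2f}")
```

Plotting recall on one axis against precision on the other for these tuples gives the curve; lowering the threshold can only keep recall the same or increase it, while precision may move in either direction.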

If I understood your comment correctly, the total similarity score you compute can be used as a decision value.
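For a retrieval system, thresholding the similarity score amounts to cutting the ranked result list at different lengths k. The sketch below assumes the results are already sorted by decreasing similarity and that you know which of them are relevant; the function name `precision_recall_at_k` and the particular order of hits in `top10` are hypothetical, chosen only so that 4 of the 10 results are relevant, as in the gorilla example.

```python
def precision_recall_at_k(ranked_relevance, total_relevant):
    """Return (k, precision, recall) for every cutoff k in the ranked list.

    ranked_relevance: 1/0 flags for the results, best match first.
    total_relevant:   number of relevant images in the whole database.
    """
    points = []
    hits = 0
    for k, is_relevant in enumerate(ranked_relevance, start=1):
        hits += int(is_relevant)
        points.append((k, hits / k, hits / total_relevant))
    return points


# Hypothetical relevance flags for the 10 returned images (4 gorillas).
top10 = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]

curve = precision_recall_at_k(top10, total_relevant=20)
for k, p, r in curve:
    print(f"k={k:2d}  P={p:.2f}  R={r:.2f}")
```

The last point reproduces the single value from the question (precision 0.4, recall 0.2). With only 10 results, recall cannot exceed 10/20 in this setup, so covering the full recall range would require ranking the whole database rather than only the top 10.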
