Solved – What loss function should one use to get a high precision or high recall binary classifier

classification, logistic, loss-functions, unbalanced-classes

I'm trying to build a detector for objects that occur very rarely (in images), planning to use a CNN binary classifier applied in a sliding/resized window. I've constructed balanced 1:1 positive-negative training and test sets (is that the right thing to do in such a case, btw?), and the classifier is doing fine on the test set in terms of accuracy. Now I want to control the recall/precision of my classifier so that, for example, it will not wrongly label too many occurrences of the majority class.

The obvious (to me) solution is to keep the same logistic loss that is used now, but to weight type I and type II errors differently by multiplying the loss in one of the two cases by some constant, which can be tuned. Is this right?

P.S. On second thought, this is equivalent to weighting some training samples more than others. I think just adding more samples of one class would achieve the same thing.
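
To make the idea concrete, here is a minimal sketch of the weighting I have in mind (plain NumPy; `pos_weight` is my hypothetical tuning constant, not an established name):

```python
import numpy as np

def weighted_logistic_loss(logits, labels, pos_weight=2.0):
    """Binary cross-entropy with the positive-class term up-weighted.

    pos_weight > 1 makes missing a positive (false negative) costlier;
    pos_weight < 1 instead penalizes false positives more.
    """
    p = 1.0 / (1.0 + np.exp(-logits))            # sigmoid
    eps = 1e-12                                   # avoid log(0)
    per_sample = -(pos_weight * labels * np.log(p + eps)
                   + (1.0 - labels) * np.log(1.0 - p + eps))
    return per_sample.mean()

# Toy check: the same confident mistake costs more on a positive sample.
logits = np.array([-2.0, -2.0])      # confident "negative" predictions
labels = np.array([1.0, 0.0])        # the first one is actually positive
print(weighted_logistic_loss(logits, labels, pos_weight=5.0))
```

(In PyTorch, for example, the same effect is available via the `pos_weight` argument of `torch.nn.BCEWithLogitsLoss`, and per-sample weighting, as in the P.S., via its `weight` argument.)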

Best Answer

Artificially constructing a balanced training set is debatable, quite controversial actually. If you do it, you should empirically verify that it really works better than leaving the training set unbalanced. Artificially balancing the test set is almost never a good idea. The test set should represent new data points as they come in without labels. You expect them to be unbalanced, so you need to know whether your model can handle an unbalanced test set. (If you don't expect new records to be unbalanced, why are all your existing records unbalanced?)
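
To see why this matters, consider a tiny sketch (synthetic 99:1 data, purely illustrative): a degenerate classifier that never predicts the minority class still looks excellent if you judge it on accuracy alone, and a balanced test set would hide exactly this failure mode:

```python
import numpy as np

# Synthetic unbalanced test set with a ~99:1 class ratio.
rng = np.random.default_rng(0)
y_test = (rng.random(10_000) < 0.01).astype(int)   # ~1% positives

# A degenerate "classifier" that always predicts the majority class.
y_pred = np.zeros_like(y_test)

accuracy = (y_pred == y_test).mean()
minority_recall = y_pred[y_test == 1].mean()       # TP / (TP + FN)

print(f"accuracy: {accuracy:.3f}, minority recall: {minority_recall:.3f}")
# accuracy ~0.99, minority recall 0.0 -- accuracy alone hides the failure
```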

Regarding your performance metric, you will always get what you ask for. If accuracy is not what you need foremost in an unbalanced set, because not only the classes but also the misclassification costs are unbalanced, then don't use it. If you use accuracy as your metric and do all your model selection and hyperparameter tuning by always taking the model with the best accuracy, you are optimizing for accuracy.
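
One practical way to optimize for what you actually need is to tune the decision threshold against your chosen metric instead of accuracy. Here is a sketch using scikit-learn's `precision_recall_curve` on a held-out set (the data below are synthetic placeholders; in practice use your own validation labels and predicted probabilities):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Toy validation data standing in for real held-out predictions.
rng = np.random.default_rng(0)
y_val = (rng.random(2_000) < 0.05).astype(int)          # ~5% positives
scores = np.clip(0.6 * y_val + rng.normal(0.2, 0.25, 2_000), 0, 1)

precision, recall, thresholds = precision_recall_curve(y_val, scores)

# Example policy: among thresholds reaching at least 90% recall on
# the minority class, pick the one with the highest precision.
ok = recall[:-1] >= 0.90                 # thresholds has one fewer entry
best = int(np.argmax(np.where(ok, precision[:-1], -1.0)))
print(f"threshold={thresholds[best]:.2f}, "
      f"precision={precision[best]:.2f}, recall={recall[best]:.2f}")
```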

I take the minority class as the positive class; this is the conventional way of naming them. Thus, precision and recall as discussed below are the precision and recall of the minority class.

  • If the only important thing is to identify all the minority-class records, you could optimize recall. You are thus accepting more false positives.
  • Optimizing only precision would be a very weird idea. You would be telling your classifier that it's not a problem to underdetect the minority class. The easiest way to get high precision is to be overcautious in declaring the minority class.
  • If you need both precision and recall, you could optimize the F-measure. It is the harmonic mean of precision and recall and thus penalizes outcomes where the two metrics diverge.
  • If you know the concrete misclassification costs in both directions (and the profits of correct classification, if they differ per class), you can put all of that into a loss function and optimize it (see the sketch after this list).
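
For the last option, here is a minimal sketch of cost-sensitive threshold selection (the cost figures and data below are hypothetical placeholders; plug in your real costs and validation predictions):

```python
import numpy as np

# Hypothetical costs: a missed positive (false negative) is assumed
# to be 20x as costly as a false alarm (false positive).
COST_FN, COST_FP = 20.0, 1.0

def expected_cost(y_true, scores, threshold):
    """Total misclassification cost at a given decision threshold."""
    pred = scores >= threshold
    fp = np.sum(pred & (y_true == 0))
    fn = np.sum(~pred & (y_true == 1))
    return COST_FP * fp + COST_FN * fn

# Toy validation data standing in for real held-out predictions.
rng = np.random.default_rng(1)
y_true = (rng.random(1_000) < 0.05).astype(int)    # ~5% positives
scores = np.clip(0.7 * y_true + rng.normal(0.15, 0.2, 1_000), 0, 1)

thresholds = np.linspace(0.01, 0.99, 99)
costs = [expected_cost(y_true, scores, t) for t in thresholds]
best_t = thresholds[int(np.argmin(costs))]
print(f"cost-minimizing threshold: {best_t:.2f}")
```

The higher the cost of a false negative relative to a false positive, the lower the chosen threshold will be, i.e. the more willing the classifier becomes to declare the minority class.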