Solved – A standard name for a formula to “maximize true positives while minimizing false positives”

accuracy, classification, machine-learning, optimization

I am using an evaluation metric that rewards the true positives and penalizes the false positives retrieved by a function $f(\cdot)$.
It can be written as follows:
$\frac{|\texttt{TP}| - |\texttt{FP}|}{|\texttt{instances}|}$, where $|\texttt{TP}|$ and $|\texttt{FP}|$ are the numbers of true positives and false positives, respectively.
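For concreteness, the metric can be computed from binary labels like this. This is a minimal sketch; the function name `tp_fp_score` is illustrative, not from any standard library:

```python
def tp_fp_score(y_true, y_pred):
    """(|TP| - |FP|) / |instances| for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return (tp - fp) / len(y_true)

# Example: 2 true positives, 1 false positive out of 4 instances
print(tp_fp_score([1, 1, 0, 0], [1, 1, 1, 0]))  # (2 - 1) / 4 = 0.25
```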

The goal is simple: select a function that maximizes $\texttt{TP}$ while minimizing $\texttt{FP}$.

I need to find a standard name for this formula to motivate its advantages for my work.
I am familiar with "sensitivity", "specificity", "F-measure", "recall", and "precision". However, none of them computes what I'm evaluating here.

Best Answer

I do not think this metric has an "official" name. For instance, it does not appear on the very comprehensive Wikipedia page on sensitivity and specificity, which also discusses many other related metrics.


I am a bit doubtful whether your proposed measure is really very useful. Suppose you have a completely random distribution of instances, with >50% positives and <50% negatives. Then you can maximize your criterion by classifying everything as "positive" - regardless of whether there are 51% true positives, or 60%, or 99.99%. Similarly, if there are <50% positives and >50% negatives, then you will maximize your criterion by classifying everything as "negative", again regardless of the actual prevalences. This incentive structure does not look very helpful to me. (And of course, the same argument holds if these prevalences are conditional on predictors.)
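The degenerate-maximizer argument is easy to verify numerically. Below is a small simulation under assumed conditions (60% prevalence, labels carrying no signal); `tp_fp_score` is an illustrative helper implementing the questioner's metric:

```python
import random

def tp_fp_score(y_true, y_pred):
    """(|TP| - |FP|) / |instances| for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return (tp - fp) / len(y_true)

random.seed(0)
# Random labels with ~60% positives and no predictive structure at all
y_true = [1 if random.random() < 0.6 else 0 for _ in range(10_000)]

all_pos = [1] * len(y_true)  # trivial "everything is positive" classifier
all_neg = [0] * len(y_true)  # trivial "everything is negative" classifier

# With >50% positives, the all-positive classifier scores roughly
# 0.6 - 0.4 = 0.2, while the all-negative one scores exactly 0.
print(tp_fp_score(y_true, all_pos))
print(tp_fp_score(y_true, all_neg))
```

The all-positive classifier wins without learning anything, which is exactly the failure mode described above.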

This is very much related to the problems of straight-up accuracy as an evaluation measure, where the exact same problem comes up. I would recommend that you take a look at Why is accuracy not the best measure for assessing classification models? and think about how this applies to your measure.
