Threshold – What is an F1 Optimal Threshold and How to Calculate It?


I've used the h2o.glm() function in R, which returns a contingency table along with other statistics. The contingency table is headed "Cross Tab based on F1 Optimal Threshold".

Wikipedia defines the F1 score (or F-score) as the harmonic mean of precision and recall. But aren't precision and recall only defined once the predicted values of, for example, a logistic regression have been converted to binary labels using a cutoff?

Speaking of cutoffs, what is the connection between the F1 score and the optimal threshold? How is an optimal threshold calculated in general, and how is the F1 optimal threshold calculated in particular?

Sorry if I've missed something; I'm new to stats.

Best Answer

I actually wrote my first paper in machine learning on this topic. In it, we showed that when your classifier outputs calibrated probabilities (as it should for logistic regression), the optimal threshold is approximately half the F1 score that it achieves. This gives you some intuition. Because F1 can never exceed 1, the optimal threshold will never be more than 0.5. If your F1 is 0.5 and the threshold is 0.5, then you should expect to improve F1 by lowering the threshold. On the other hand, if the F1 were 0.5 and the threshold were 0.1, you should probably increase the threshold to improve F1.
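To make "F1 optimal threshold" concrete, here is a minimal Python sketch of one common way to find it: try each distinct predicted score as a cutoff, compute F1 at that cutoff, and keep the best one. This is an illustration only and not necessarily what h2o.glm() does internally. The function names and the toy labels and scores are made up for the example.

```python
from typing import Sequence, Tuple


def f1_at_threshold(y_true: Sequence[int], scores: Sequence[float],
                    threshold: float) -> float:
    """F1 when every score >= threshold is predicted positive."""
    tp = fp = fn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
    # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_optimal_threshold(y_true: Sequence[int],
                         scores: Sequence[float]) -> Tuple[float, float]:
    """Sweep the distinct scores as candidate cutoffs; return (threshold, F1)."""
    best_t, best_f1 = 1.0, 0.0
    for t in sorted(set(scores)):
        f1 = f1_at_threshold(y_true, scores, t)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Toy data, invented for illustration (not from a real model).
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.70, 0.90]

best_t, best_f1 = f1_optimal_threshold(y_true, scores)
print(f"best threshold = {best_t:.2f}, F1 = {best_f1:.3f}")
# prints: best threshold = 0.20, F1 = 0.800
print(f"F1 at 0.5 cutoff = {f1_at_threshold(y_true, scores, 0.5):.3f}")
# prints: F1 at 0.5 cutoff = 0.571
```

On this toy data the default 0.5 cutoff gives a lower F1 than a cutoff of 0.2, which matches the direction of the advice above. The data set is far too small, and the scores are not guaranteed to be calibrated, for the "half the F1" relationship to show up.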

The paper, with all the details and a discussion of why F1 may or may not be a good measure to optimize (in both the single-label and multilabel cases), can be found here:

https://arxiv.org/abs/1402.1892

Sorry that it took 9 months for this post to come to my attention. Hope that you still find the information useful!
