Solved – Which performance measure for unbalanced binary classification without an ‘active’ class

classification, machine learning, model-evaluation, unbalanced-classes

My datasets have two classes, A and B. The classes should be treated equally (there is no "active/inactive" class). The datasets are unbalanced: sometimes A is more frequent, sometimes B. Which performance measure should I use?

Accuracy makes no sense on unbalanced datasets. If I understand correctly, F-measure and AUC assume that there is an active class: F-measure ignores true negatives, since it is the harmonic mean of precision and recall, and AUC ignores true negatives and false negatives.

So what performance measure should I use?
Is (AUC(active=A) + AUC(active=B)) / 2 a valid option?

CORRECTION:

Apparently, I misunderstood how AUC works. It does NOT ignore true negatives and false negatives. The ROC curves look different depending on which class is considered the active one, but AUC(active=A) = AUC(active=B).
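The symmetry stated in the correction can be checked numerically. The following is a minimal sketch, not a library implementation: it computes AUC as the fraction of (positive, negative) pairs that are ranked correctly, with ties counted as one half. The labels and scores are made-up illustration data, and the score for B is assumed to be 1 minus the score for A.

```python
def auc(scores, labels, positive):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: five examples of class A, two of class B.
labels = ["A", "A", "A", "B", "B", "A", "A"]
score_a = [0.9, 0.8, 0.4, 0.7, 0.2, 0.6, 0.3]  # assumed model scores for class A
score_b = [1.0 - s for s in score_a]           # assumption: score for B is the complement

auc_a = auc(score_a, labels, positive="A")
auc_b = auc(score_b, labels, positive="B")
print(auc_a, auc_b)  # both values agree
```

Swapping the active class reverses the ranking and swaps the roles of positives and negatives at the same time, so the same pairs are counted as correctly ordered.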

Best Answer

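Have a look at the Matthews Correlation Coefficient (MCC). It uses all four cells of the confusion matrix and is unchanged when the two classes swap roles, so neither class has to be the "active" one: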

$$MCC = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{ (TP + FP)(TP + FN)(TN + FP)(TN + FN) }}$$
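As a rough sketch of the formula above, MCC can be computed directly from the four confusion-matrix counts. The function name `mcc`, the example counts, and the choice to return 0 when the denominator is zero are assumptions made for illustration.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: return 0.0 when a row or column of the matrix is empty.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts with A as the positive class (100 A examples, 10 B examples).
mcc_a_positive = mcc(tp=90, tn=5, fp=5, fn=10)

# Treating B as positive swaps TP with TN and FP with FN.
mcc_b_positive = mcc(tp=5, tn=90, fp=10, fn=5)

print(mcc_a_positive, mcc_b_positive)  # identical values
```

Because the numerator and the denominator are both unchanged by that swap, the value does not depend on which class is labelled positive.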

I have seen it used fairly often as a performance metric for classification on SNP datasets. Have a look at this link as well; they discuss the difference between AUC and MCC.

Otherwise, you can simply compute the average per-class accuracy (equivalently, one minus the average per-class error rate), often called balanced accuracy. I have seen people use it in multiclass problems as well.

$$AAcc = \frac{1}{2} \bigg( \frac{TP}{TP + FN} + \frac{TN}{TN + FP} \bigg) $$
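To illustrate why this average differs from plain accuracy on unbalanced data, here is a small sketch that evaluates both on a classifier that always predicts the majority class. The function names and the counts are hypothetical.

```python
def average_accuracy(tp, tn, fp, fn):
    """Mean of the per-class recalls, as in the AAcc formula above."""
    recall_positive = tp / (tp + fn)
    recall_negative = tn / (tn + fp)
    return 0.5 * (recall_positive + recall_negative)


def accuracy(tp, tn, fp, fn):
    """Plain accuracy: share of all examples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical dataset: 95 examples of A (positive), 5 of B (negative),
# and a classifier that always predicts A.
counts = dict(tp=95, tn=0, fp=5, fn=0)

plain = accuracy(**counts)
balanced = average_accuracy(**counts)
print(plain, balanced)
```

Plain accuracy rewards the trivial classifier for the class imbalance, while the per-class average stays at chance level (0.5) regardless of which class is the frequent one.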

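It is commonly used in authentication systems in the form of the Half Total Error Rate (HTER), the mean of the false acceptance rate and the false rejection rate, which equals 1 - AAcc. E.g. here they provided a statistical test for that.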
