Solved – Is it correct to use Precision-Recall AUC in a balanced dataset situation

data-mining, machine-learning, precision-recall, roc

I have a binary classification scenario with an imbalanced dataset (many more negatives than positives). When I train a classifier on this dataset I get a Precision-Recall AUC (PR-AUC) of 0.7.

I then under-sampled the negatives to balance the dataset, trained the classifier on the balanced data, and got a PR-AUC of 0.9.

My question: is it correct to use PR-AUC rather than ROC-AUC here? As far as I know, PR-AUC is strongly influenced by class imbalance, so I'm afraid the higher PR-AUC on the balanced dataset is just an artifact of that dependence.

In other words, did under-sampling the dataset bias the result, or did it genuinely improve classification performance?

Best Answer

You can't compare PR-AUC values based on differently balanced data. You can use ROC-AUC for that, though, since that does not depend on class balance.

The larger the fraction of positives in the dataset, the larger the area under the PR curve will be for a given model. By increasing the fraction of positives, you artificially inflate PR-AUC; whether any of the increase also reflects a genuinely better model cannot be disentangled from that inflation.

A random model has a PR-AUC equal to the fraction of positives, since its precision equals the fraction of positives regardless of the recall. For ROC curves, by contrast, a random model has an AUC of 50% independent of class balance. If you want to compare models across different levels of class balance, use ROC-AUC instead of PR-AUC.
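You can see both effects with a quick simulation. The sketch below (an illustration I'm adding, not from your experiment) scores a random model on synthetic labels at two class balances: PR-AUC tracks the positive fraction, while ROC-AUC stays near 0.5 either way.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

rng = np.random.default_rng(0)

def eval_random_model(n_pos, n_neg):
    """Evaluate a random scorer (scores independent of the labels)."""
    y = np.r_[np.ones(n_pos), np.zeros(n_neg)]
    scores = rng.random(n_pos + n_neg)  # random model: no signal about y
    # average_precision_score is a common PR-AUC estimate
    return average_precision_score(y, scores), roc_auc_score(y, scores)

# Imbalanced: 10% positives
pr_imb, roc_imb = eval_random_model(1_000, 9_000)
# Balanced: 50% positives (as after under-sampling)
pr_bal, roc_bal = eval_random_model(5_000, 5_000)

print(f"imbalanced: PR-AUC={pr_imb:.2f}, ROC-AUC={roc_imb:.2f}")
print(f"balanced:   PR-AUC={pr_bal:.2f}, ROC-AUC={roc_bal:.2f}")
```

The PR-AUC jumps from roughly 0.1 to roughly 0.5 purely because of the re-balancing, with no change in model quality, while ROC-AUC stays at chance level in both cases.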