Solved – Is it correct to use Precision-Recall AUC in a balanced dataset situation

data-mining, machine-learning, precision-recall, roc

I have a binary classification scenario with an imbalanced dataset (many more negatives than positives). When I train a classifier on this dataset I get a Precision-Recall AUC (PR-AUC) of 0.7.

I then under-sampled the negatives to balance the dataset, trained the classifier on the balanced data, and got a PR-AUC of 0.9.

My question: is it correct to use PR-AUC rather than ROC-AUC here? As far as I know, PR-AUC is strongly influenced by class imbalance, so I'm afraid the higher PR-AUC on the balanced dataset is just an artifact of that dependence.

In other words, did under-sampling the dataset bias the result, or did it genuinely improve classification performance?

Best Answer

You can't compare PR-AUC values based on differently balanced data. You can use ROC-AUC for that, though, since that does not depend on class balance.

The larger the fraction of positives in the dataset, the larger the area under the PR curve will be for a given model. By increasing the fraction of positives, you artificially inflate PR-AUC; whether any of the increase also reflects a genuinely better model cannot be disentangled from that inflation.

A random model has a PR-AUC equal to the fraction of positives, since its precision equals the fraction of positives regardless of the recall. For ROC curves, by contrast, a random model has an AUC of 50% independent of class balance. If you want to compare models across different levels of class balance, use ROC-AUC instead of PR-AUC.
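You can see both effects with a quick simulation. The sketch below (an illustration I'm adding, not from your experiment) scores a random model on synthetic labels at two class balances: PR-AUC tracks the positive fraction, while ROC-AUC stays near 0.5 either way.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

rng = np.random.default_rng(0)

def eval_random_model(n_pos, n_neg):
    """Evaluate a random scorer (scores independent of the labels)."""
    y = np.r_[np.ones(n_pos), np.zeros(n_neg)]
    scores = rng.random(n_pos + n_neg)  # random model: no signal about y
    # average_precision_score is a common PR-AUC estimate
    return average_precision_score(y, scores), roc_auc_score(y, scores)

# Imbalanced: 10% positives
pr_imb, roc_imb = eval_random_model(1_000, 9_000)
# Balanced: 50% positives (as after under-sampling)
pr_bal, roc_bal = eval_random_model(5_000, 5_000)

print(f"imbalanced: PR-AUC={pr_imb:.2f}, ROC-AUC={roc_imb:.2f}")
print(f"balanced:   PR-AUC={pr_bal:.2f}, ROC-AUC={roc_bal:.2f}")
```

The PR-AUC jumps from roughly 0.1 to roughly 0.5 purely because of the re-balancing, with no change in model quality, while ROC-AUC stays at chance level in both cases.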