Solved – How to interpret AUROC score

auc, classification, machine-learning, roc, unbalanced-classes

My model has an AUROC value of 0.7, and I have a 75:25 class imbalance (75% negative, 25% positive). From my understanding, AUROC is calculated by sweeping over different thresholds for converting the predicted probability into a positive label. I was wondering if the interpretation of the AUROC score is affected by imbalanced classes (i.e., would I interpret it differently if my data were split 50:50)? Essentially, what (if anything) can I say about my model's performance?

Also, I do not fully understand the straight line on the ROC plot that represents the random classifier. How do we know that this is a random classifier? By random, does this mean a classifier that essentially guesses, predicting the positive class with probability 0.5 and the negative class with probability 0.5?
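To check my understanding, here is a minimal sketch of the threshold sweep as I picture it (plain NumPy; the sample size, class ratio, and uniform scores are all made up for illustration), applied to a "classifier" whose scores are pure noise:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up setup: 1000 observations, 25% positive (my class ratio).
y = (rng.random(1000) < 0.25).astype(int)
scores = rng.random(1000)  # "random classifier": scores carry no information

# Sweep thresholds and record (FPR, TPR) pairs -- the points of the ROC curve.
for t in np.linspace(0, 1, 11):
    pred = (scores >= t).astype(int)
    tpr = ((pred == 1) & (y == 1)).sum() / (y == 1).sum()
    fpr = ((pred == 1) & (y == 0)).sum() / (y == 0).sum()
    print(f"threshold={t:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

At every threshold, FPR comes out roughly equal to TPR, which traces the diagonal, so I suspect that is what the "random classifier" line represents, but I would like to confirm.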

Best Answer

ROC curves are insensitive to class imbalance. This means that the ROC curve will look the same when you change the class proportion of your dataset (up to statistical uncertainty, of course).

That's because:

  • sensitivity is calculated from the numbers of true positives and false negatives, which together count only the actually positive observations,
  • specificity is calculated from the numbers of false positives and true negatives, which together count only the actually negative observations.

Therefore:

  • If you change the fraction of positive observations, you will change both true positives and false negatives in the same proportion, and sensitivity will stay the same.
  • Similarly, if you change the fraction of negative observations, you will change both false positives and true negatives in the same proportion, and specificity will stay the same.

You can look at the explanation of the confusion matrix on Wikipedia, which should make all this clearer.
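A quick way to see this empirically is to compute the AUROC on the same scores while discarding negatives to change the class ratio. A minimal sketch, assuming scikit-learn is available (the score distributions are made up to give an AUROC near the 0.7 in the question):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Made-up scores: positives score somewhat higher than negatives,
# giving an AUROC in the ballpark of 0.7.
n_pos, n_neg = 2500, 7500                      # 25:75, as in the question
pos_scores = rng.normal(0.60, 0.2, n_pos)
neg_scores = rng.normal(0.45, 0.2, n_neg)

def auroc(n_neg_used):
    """AUROC keeping only n_neg_used negatives (changes the class ratio)."""
    y = np.r_[np.ones(n_pos), np.zeros(n_neg_used)]
    s = np.r_[pos_scores, neg_scores[:n_neg_used]]
    return roc_auc_score(y, s)

print(auroc(7500))   # 75:25 imbalance
print(auroc(2500))   # 50:50 balance -- essentially the same value
```

Both calls should print roughly 0.70; only sampling noise distinguishes them.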

Now, the ROC curve may be unaffected, but it is not the only way to measure a model's performance, and predictive values are affected by class imbalance. The positive predictive value answers: given a positive prediction, what is the chance that the observation is actually positive? The negative predictive value answers: given a negative prediction, what is the chance that the observation is actually negative? These values matter when you want to apply your model to make decisions on new data, so you should calculate them on a dataset that is representative of the population on which the model will eventually make predictions. A sketch of this dependence follows below.
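Here is a minimal sketch, holding sensitivity and specificity fixed (the 0.70 values are made up for illustration; they roughly correspond to one operating point of an AUROC-0.7 model) and varying only the prevalence:

```python
# Fix sensitivity and specificity (which the ROC curve determines at a
# chosen threshold) and vary only the prevalence of the positive class.
sens, spec = 0.70, 0.70   # illustrative values

for prevalence in (0.50, 0.25, 0.05):
    tp = sens * prevalence              # true positive rate x prevalence
    fn = (1 - sens) * prevalence
    tn = spec * (1 - prevalence)
    fp = (1 - spec) * (1 - prevalence)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

The same operating point gives a PPV of about 0.70 at 50:50 balance, about 0.44 at 25% prevalence, and about 0.11 at 5% prevalence: identical ROC behavior, very different usefulness of a positive prediction. This is why predictive values must be computed at the prevalence of the deployment population.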
