Solved – Area under ROC curve for random forest

auc, caret, classification, random forest, rocr

Does the area under the ROC curve depend on which class the random forest model treats as the default positive class?

I am using the caret package in R to train and validate a random forest model.

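library(ROCR)
library(caret)
# x, y, ctrl (a trainControl object) and testdata are assumed to be defined already
rfmodel <- train(x, y, method = "rf", trControl = ctrl,
                 ntree = 500, tuneGrid = data.frame(mtry = c(2, 3, 4)))
print(rfmodel)
# Class probabilities on the test set: one column per class level
predict.rf <- predict(rfmodel, testdata, type = "prob")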

Now predict.rf has two columns giving the predicted probabilities for class 0 and class 1, respectively. Which of these columns should be used to calculate the area under the ROC curve?

In the current case, the TPR is by default defined by taking class 0 as the positive class. As I understand it, the ROC curve is a plot of TPR against FPR. Do the ROC curve and the AUC change if I define class 1 as the positive class, so that TPR and FPR are swapped accordingly?

Best Answer

Yes, but this is not relevant in practice, except in very rare cases where the two classes are not interchangeable for the model (as in a one-class SVM).

Exchanging the class order, while keeping the same score column, simply changes the AUROC from $a$ to $1-a$. Either way, your model is informative as long as the AUROC is far from 0.5. It is therefore generally safe to report $1-a$ when $a<0.5$, and many AUROC implementations do this automatically.
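The flip from $a$ to $1-a$ can be checked with a small sketch in base R. The helper `auc_rank` and the toy vectors `labels`, `p1` and `p0` are hypothetical stand-ins (not from the question) for the test labels and the two columns of predict.rf. The AUC is computed here from the rank (Mann-Whitney) formulation rather than with ROCR, so the snippet runs without extra packages.

```r
# AUC via the rank (Mann-Whitney) formulation: the probability that a randomly
# chosen positive case gets a higher score than a randomly chosen negative case.
auc_rank <- function(scores, labels, positive) {
  is_pos <- labels == positive
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  r <- rank(scores)  # average ranks are used for tied scores
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical toy data standing in for the test labels and predict.rf columns
labels <- factor(c(0, 0, 0, 1, 1, 1, 0, 1))
p1 <- c(0.1, 0.4, 0.35, 0.8, 0.7, 0.3, 0.2, 0.9)  # P(class 1)
p0 <- 1 - p1                                      # P(class 0)

a_matched_1 <- auc_rank(p1, labels, positive = "1")  # 0.875
a_matched_0 <- auc_rank(p0, labels, positive = "0")  # 0.875
a_mismatch  <- auc_rank(p1, labels, positive = "0")  # 0.125, i.e. 1 - 0.875

print(c(a_matched_1, a_matched_0, a_mismatch))
```

In this toy case, pairing each probability column with its own class as the positive class gives the same AUC, and only a mismatched pairing (class 1 scores with class 0 as positive) produces $1-a$.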
