Does the area under the ROC curve depend on which class the random forest model treats as the default positive class?
I am using the caret package in R to train and validate a random forest model.
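library(ROCR)
library(caret)
# x: predictors, y: two-level factor response, ctrl: a trainControl() object (all defined elsewhere)
rfmodel <- train(x, y, method = "rf", trControl = ctrl,
                 ntree = 500, tuneGrid = data.frame(mtry = c(2, 3, 4)))
print(rfmodel)
# type = "prob" returns one column of predicted probabilities per class level
predict.rf <- predict(rfmodel, testdata, type = "prob")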
Now predict.rf has two columns, giving the predicted probabilities for class 0 and class 1 respectively. Which of these columns should be used to calculate the area under the ROC curve?
In the current case, the tpr is by default defined by taking class 0 as the positive class. As I understand it, the ROC curve is a plot of tpr against fpr. Do the ROC curve and the AUC change if I define the positive class as 1, so that tpr and fpr are swapped accordingly?
Best Answer
Yes, but it is not relevant in practice, except in rare cases where the model does not treat the two classes symmetrically (as in a one-class SVM).
Exchanging the class order, with the scores left unchanged, simply changes the AUROC from $a$ to $1-a$, so either way the model is as informative as its AUROC is far from 0.5. It is therefore generally safe to report $1-a$ when $a<0.5$, and many AUROC implementations do this automatically.
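The $a$ versus $1-a$ relation can be checked numerically. The following is a minimal sketch in base R, not the caret or ROCR implementation: `auc_rank` is a hypothetical helper that computes the AUROC from ranks (the Mann-Whitney statistic), and `p1` and `p0` are simulated stand-ins for the two columns of `predict.rf`, not output from a fitted model.

```r
# Rank-based (Mann-Whitney) AUROC: the probability that a randomly chosen
# positive case gets a higher score than a randomly chosen negative case.
# Hypothetical helper for illustration only.
auc_rank <- function(score, is_positive) {
  r <- rank(score)                      # average ranks, so ties count as 1/2
  n_pos <- sum(is_positive)
  n_neg <- sum(!is_positive)
  (sum(r[is_positive]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

set.seed(1)
y  <- factor(rep(c("0", "1"), each = 50))    # true class labels
p1 <- c(rbeta(50, 2, 4), rbeta(50, 4, 2))    # simulated P(class 1)
p0 <- 1 - p1                                 # simulated P(class 0)

a          <- auc_rank(p1, y == "1")  # positive = 1, scored with P(class 1)
a_mismatch <- auc_rank(p1, y == "0")  # positive = 0, but still scored with P(class 1)
a_swapped  <- auc_rank(p0, y == "0")  # positive = 0, scored with P(class 0)

print(c(a = a, a_mismatch = a_mismatch, a_swapped = a_swapped))
```

Here `a_mismatch` equals `1 - a`, which is the flip described above, while `a_swapped` equals `a`: if the positive class and the probability column are switched together, the AUROC does not change. In practice this suggests passing the probability column that belongs to whichever class is declared positive. In ROCR, the `label.ordering` argument of `prediction()` controls which label is treated as negative and which as positive.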