Solved – Differences in AUC calculation between pROC and ROCR

Tags: auc, machine learning, r, roc

Does anyone know how the AUC calculation differs between these two packages? They give different results when I add positives with a predicted value of 0 (simulating a probability model in which many outputs are zero, e.g. randomForest). I would like to know why there is a kink in the pROC results (red) and why the AUC 'flips'. The plot below illustrates this behaviour.

This stems from the problem of having to compute AUCs for models that assign a probability of zero to many positives.

  library(ROCR)
  library(pROC)

  # Perfectly separated start: 1000 positives ("1") with pred = 1, 1000 negatives ("0") with pred = 0
  start.ds <- data.frame(
    response = gl(2, 1000, labels = c("1", "0")),
    pred = ifelse(as.numeric(gl(2, 1000, labels = c("0", "1"))) == 2, 0, 1)
  )
  # 2000 extra positives that the model scores as 0
  add.positives <- data.frame(response = gl(1, 1000, labels = c("1")), pred = 0)
  add.positives <- rbind(add.positives, add.positives)

  nadd <- nrow(add.positives)
  theSeq <- seq(1, nadd, 10)
  allAUCs <- NULL
  for (i in theSeq) {
    print(i)
    temp.ds <- rbind(start.ds, add.positives[1:i, ])
    temp.ds <- temp.ds[order(temp.ds$pred, decreasing = TRUE), ]

    aucPROC <- as.numeric(auc(temp.ds[, 1], temp.ds[, 2]))
    aucAUCPackage <- performance(prediction(temp.ds[, 2], temp.ds[, 1]), "auc")@y.values[[1]]
    allAUCs <- rbind(allAUCs, data.frame(obs = i, aucPROC = aucPROC, aucAUCPackage = aucAUCPackage))
  }

  # pROC in red, ROCR in black
  plot(allAUCs$aucPROC, col = "red", type = "l")
  lines(allAUCs$aucAUCPackage, col = "black", type = "l")

[Figure: AUC from pROC (red) and ROCR (black) as positives with a predicted value of 0 are added]

Best Answer

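You never specified which class is the positive one or in which direction the predictor separates the classes, so pROC picks both for you. Try the following:

# default: pROC picks the direction by comparing the group medians
aucPROC <- as.numeric(auc(temp.ds[,1],temp.ds[,2], direction="auto"))
# controls are expected to score lower than cases
aucPROC <- as.numeric(auc(temp.ds[,1],temp.ds[,2], direction="<"))
# controls are expected to score higher than cases
aucPROC <- as.numeric(auc(temp.ds[,1],temp.ds[,2], direction=">"))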

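You also need to make sure that the positive and negative examples are mapped to the right levels. With 0s and 1s this will probably be detected correctly, but the response factor in the question was built with labels = c("1", "0"), so by default "1" ends up as the control level. It is safer to pass levels explicitly, controls first and cases second: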

aucPROC <- as.numeric(auc(temp.ds[,1],temp.ds[,2], levels = c(0, 1), direction="<"))

See ?roc for more details.
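
To see why an automatic direction can produce the kink, here is a minimal base R sketch. It does not call pROC: `pairwise_auc`, `auto_direction`, and `make_controls` are hypothetical helpers, and the median rule in `auto_direction` is only an assumption modelled on how ?roc describes direction="auto". The data is a scaled-down version of the question's setup (10 observations per group instead of 1000).

```r
# Pairwise (Mann-Whitney) AUC in base R. These helpers are illustrative only,
# they are not pROC functions.
pairwise_auc <- function(controls, cases, direction = c("<", ">")) {
  direction <- match.arg(direction)
  diffs <- outer(cases, controls, "-")      # case score minus control score, all pairs
  if (direction == ">") diffs <- -diffs     # ">" means controls are expected to score higher
  mean(diffs > 0) + 0.5 * mean(diffs == 0)  # concordant pairs plus half credit for ties
}

# Assumed stand-in for an "auto" rule: choose the direction from the group medians.
auto_direction <- function(controls, cases) {
  if (median(controls) <= median(cases)) "<" else ">"
}

# Scaled-down version of the question's data. With factor levels c("1", "0"),
# response "1" plays the control role and response "0" the case role.
cases <- rep(0, 10)                                    # response "0", all scored 0
make_controls <- function(i) c(rep(1, 10), rep(0, i))  # response "1": 10 scored 1, i scored 0

n_added <- c(5, 10, 11, 15)
result <- data.frame(
  added = n_added,
  auto  = sapply(n_added, function(i) {
    ctrl <- make_controls(i)
    pairwise_auc(ctrl, cases, auto_direction(ctrl, cases))
  }),
  fixed = sapply(n_added, function(i) pairwise_auc(make_controls(i), cases, ">"))
)
print(result)
```

In this sketch the `auto` column drops from 0.75 to about 0.26 as soon as more than half of the "1" group is scored 0, because the median of that group falls to 0 and the chosen direction changes. The `fixed` column declines smoothly, and from that point on the two columns are related by auto = 1 - fixed, which is the flip seen in the plot.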