I was comparing the performance of the pROC and AUC packages when computing auc() on random data:
library("pROC");
library("AUC")
predictor <- rnorm(10000, 5);
outcome <- rnorm(10000) > 0;
print ("pROC:::auc() time & output")
system.time(x <- pROC:::auc(outcome, predictor))
print(x);
print ("AUC:::auc() time & output")
system.time(x <- AUC:::auc(AUC:::roc(predictor, factor(outcome))))
print(x);
AUC:::auc seemed to perform substantially faster, but what I found strange is that they compute different AUC values for the same dataset:
> system.time(x <- pROC:::auc(outcome, predictor))
   user  system elapsed
   1.00    0.01    1.31
> print(x);
Area under the curve: 0.5058
> print ("AUC:::auc() time & output")
[1] "AUC:::auc() time & output"
> system.time(x <- AUC:::auc(AUC:::roc(predictor, factor(outcome))))
   user  system elapsed
   0.19    0.00    0.18
> print(x);
[1] 0.4942452
I thought auc() was deterministic, so both packages should produce the same number.
Yet pROC produces 0.5058 and AUC produces 0.4942452.
Am I misusing either function?
EDIT: FYI, I tried making the outcome semi-random (partly dependent on the predictor), and the functions now give identical results (bar rounding errors):
predictor <- runif(10000);
outcome <- as.integer((predictor + runif(10000)) > 0.5);
Best Answer
First, notice that $1 - 0.4942452 = 0.5057548 ≈ 0.5058$.
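The complementarity can be checked without either package. The following is a minimal base-R sketch, not the code of pROC or AUC: rank_auc is a hypothetical helper that computes the rank-based (Mann-Whitney) AUC under the assumption that higher scores indicate the positive class, and the seed is added only for reproducibility. Negating the predictor flips that assumption, and the two results sum to 1.

```r
# Hypothetical helper: rank-based (Mann-Whitney) AUC,
# assuming higher scores indicate the positive class.
rank_auc <- function(scores, positive) {
  r <- rank(scores)              # average ranks, so ties count as 0.5
  n_pos <- sum(positive)
  n_neg <- sum(!positive)
  (sum(r[positive]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

set.seed(1)                      # assumption: seed for reproducibility only
predictor <- rnorm(10000, 5)
outcome <- rnorm(10000) > 0

auc_up   <- rank_auc(predictor, outcome)    # positives assumed to score higher
auc_down <- rank_auc(-predictor, outcome)   # positives assumed to score lower

# The two direction conventions are complementary.
print(c(auc_up, auc_down))
print(all.equal(auc_up + auc_down, 1))
```

Whichever convention a package applies to uninformative data like this decides whether the reported value lands just below or just above 0.5.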
Second, notice that, according to ?AUC::roc, AUC::roc accepts only labels in {0,1}, with predictions being the probability of the positive event.
pROC::roc makes no such assumptions about the input data. It has some heuristics to detect which level is the positive case and which is the negative case and, more importantly, whether the prediction scores are higher in the positive or in the negative cases. The scores don't have to be probabilities; any numeric values or even ordered factors will do. This can be controlled with the levels and direction arguments: by default, pROC auto-detects the direction based on the median values in the two groups being compared.
Please note that when you sample curves with AUC close to 0.5, this auto-detection will result in a bias towards a higher AUC; therefore you should keep the direction fixed. To get results identical to AUC::roc you should use the following:
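A minimal sketch of such a call, assuming the data from the question (the seed is an addition for reproducibility) and fixing both arguments explicitly: levels = c(FALSE, TRUE) names the control and case levels in that order, and direction = "<" states that controls are expected to score lower than cases, which matches the convention described for AUC::roc.

```r
library(pROC)

set.seed(42)                     # assumption: seed for reproducibility only
predictor <- rnorm(10000, 5)
outcome <- rnorm(10000) > 0

# Fix the control/case levels and the comparison direction,
# so pROC does not auto-detect them from the group medians.
auc_fixed <- pROC::auc(outcome, predictor,
                       levels = c(FALSE, TRUE),  # control, case
                       direction = "<")          # controls < cases
print(auc_fixed)
```

With the direction fixed, the value can fall below 0.5 on uninformative data instead of being reflected above it.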