AUC Score – How to Generate Targets to Achieve a Given AUC Score from Classifier Scores

auc, classification, simulation

Say I have output scores from a trained classifier (perhaps logits or poorly calibrated probabilities), such as y_hat = [1, 2, 3, 4], and an expected positive rate of 25%. I want to generate targets that satisfy the expected positive rate and give the scores an AUC of 66.6%. A vector of targets like y = [0, 0, 1, 0] would give me what I need, because

mean(y) = 0.25
auc(y, y_hat) = 0.666

Is there an efficient way to generate, or preferably sample, targets like this? In reality, my score arrays have hundreds of millions of rows.
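
The toy example above can be checked by hand with a minimal base R sketch. It uses the pairwise definition of the AUC (the fraction of positive/negative pairs in which the positive has the higher score, with ties counted as one half). The variable names `pos`, `neg`, `pairwise_auc` and `prevalence` are introduced here for illustration only, and the pairwise comparison is only practical for tiny inputs.

```r
y_hat <- c(1, 2, 3, 4)
y     <- c(0, 0, 1, 0)

pos <- y_hat[y == 1]  # scores of the positives
neg <- y_hat[y == 0]  # scores of the negatives

# Compare every positive with every negative; ties count one half
pairwise_auc <- mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))
prevalence <- mean(y)

pairwise_auc  # 2/3: the positive (score 3) beats two of the three negatives
prevalence    # 0.25
```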

Best Answer

Here you go:

library(pROC)

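sample_auc <- function(prevalence, auc, n) {
  # Step 1: draw case (1) / control (0) labels with the target prevalence
  truth <- rbinom(n = n, size = 1, prob = prevalence)
  n_cases <- sum(truth)

  # Step 2: each group has variance 0.5, so case minus control is
  # N(qnorm(auc), 1) and P(case > control) equals the target AUC
  measure <- numeric(n)
  measure[truth == 0] <- rnorm(n = n - n_cases, mean = 0, sd = sqrt(0.5))
  measure[truth == 1] <- rnorm(n = n_cases, mean = qnorm(p = auc), sd = sqrt(0.5))

  # pROC::auc is namespaced because the argument `auc` shares its name;
  # levels and direction are fixed so roc() does not auto-detect them
  return(list("auc" = pROC::auc(pROC::roc(truth, measure, levels = c(0, 1),
                                          direction = "<", quiet = TRUE)),
              "prevalence" = n_cases / n,
              "data" = data.frame(truth, measure)))
}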
nsim <- 1e7
samples <- sample_auc(prevalence = 0.25, auc = 0.66, n = nsim)
samples$auc # approximately 0.66
samples$prevalence # approximately 0.25

# Labels in order of ascending simulated measure
truth <- samples$data$truth[order(samples$data$measure)]
your_sorted_measures <- 1:nsim
# your_sorted_measures can be any sorted data; the following works too:
# your_sorted_measures <- sort(rgamma(n = nsim, shape = 1, scale = 1))
auc(roc(truth, your_sorted_measures)) # approximately 0.66

I adapted some old code of mine that generates normally distributed measurements with a prespecified AUC and prevalence. It should work with any sorted data that does not contain an excessive number of ties.
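
If the scores are not already sorted, the labels can presumably be mapped back through `order()` instead of sorting the scores themselves. The following is a minimal sketch under that assumption; `y_hat`, `latent`, `sorted_truth` and `y` are hypothetical names, and the scores here are simulated stand-ins for real classifier output.

```r
set.seed(1)
n <- 1000

# Hypothetical unsorted classifier scores
y_hat <- rnorm(n)

# Labels ordered by a simulated measure (same two-normal construction)
truth <- rbinom(n, size = 1, prob = 0.25)
latent <- rnorm(n, mean = truth * qnorm(0.66), sd = sqrt(0.5))
sorted_truth <- truth[order(latent)]

# Give the i-th label of the sorted sequence to the i-th smallest score
y <- integer(n)
y[order(y_hat)] <- sorted_truth
```

After this, `y` is aligned with `y_hat` row by row, so the original row order of the score array is preserved.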

The general idea is the following:

  1. Generate cases and controls from a binomial distribution with prespecified prevalence.
  2. Assign measures to the cases and controls by sampling from two normal distributions, each with variance $0.5$, so that the difference between a case and a control is distributed as $N(\mu=\Phi^{-1}(\text{AUC}), \sigma^2=1)$. Hence a randomly drawn case has a probability of $\text{AUC}$ of being larger than a randomly drawn control.
  3. Since the $\text{AUC}$ does not care about the actual value but only about the ordering, we can apply the order of cases and controls to any ordered data set of measures.
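
To see why step 2 gives the target AUC, here is a short derivation. It assumes independent draws $X_0$ (a control) and $X_1$ (a case); these symbols, and the standard normal $Z$, are introduced here for illustration.

$$
X_0 \sim N\left(0, \tfrac{1}{2}\right), \quad X_1 \sim N\left(\mu, \tfrac{1}{2}\right), \quad \mu = \Phi^{-1}(\text{AUC})
$$

$$
X_1 - X_0 \sim N(\mu, 1) \;\Longrightarrow\; P(X_1 > X_0) = P(Z > -\mu) = \Phi(\mu) = \text{AUC}, \quad Z \sim N(0, 1)
$$

Because the distributions are continuous, ties occur with probability zero, so the probability that a case outranks a control is exactly the AUC.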

Whether this counts as efficient depends on your requirements. Generating 1e7 samples took around 20 seconds on my machine, so I guess hundreds of millions of rows should be doable in a reasonable amount of time.

The pROC package is not strictly required; its only purpose is to check that the resulting AUC matches the target.
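
As one possible way to run that check without pROC, the AUC can be computed in base R from ranks (the Mann-Whitney statistic). This is a sketch, not part of the original code: `auc_by_rank` is a hypothetical helper name, and the data are simulated with the same two-normal construction at a smaller size.

```r
# Hypothetical helper: rank-based (Mann-Whitney) AUC in base R
auc_by_rank <- function(truth, measure) {
  n_pos <- sum(truth == 1)
  n_neg <- sum(truth == 0)
  r <- rank(measure)  # default ties.method gives midranks
  u <- sum(r[truth == 1]) - n_pos * (n_pos + 1) / 2
  u / (as.numeric(n_pos) * n_neg)  # as.numeric avoids integer overflow
}

set.seed(42)
n <- 1e5
truth <- rbinom(n, size = 1, prob = 0.25)
measure <- rnorm(n, mean = truth * qnorm(0.66), sd = sqrt(0.5))
est <- auc_by_rank(truth, measure)
est  # should be close to the target of 0.66
```

The cost is dominated by a single call to `rank()`, which is a sort, so this should scale reasonably to large arrays.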