Solved – ROC accuracy measure error when no positive values are predicted

auc, cross-validation

I have an unbalanced dataset that I plan to balance by up- or down-sampling. In the meantime, I've run into an interesting error when calculating the AUC in R with pROC::auc in the case where no positive predictions are made.

The model predicted zero positive cases on the test set: all 0's for my binary response. Does it make intuitive sense that the function throws an error for the AUC in this case, or should it return zero or some other value? I know that balanced accuracy and other confusion-matrix measures still produce a number even when there are no positive predictions.
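For instance, a toy sketch with made-up labels shows that sensitivity, specificity, and balanced accuracy are still defined when every prediction is 0:

obs  <- factor(c(0, 0, 0, 1, 1), levels = c(0, 1))   # made-up observed labels
pred <- factor(c(0, 0, 0, 0, 0), levels = c(0, 1))   # all-negative predictions
tab  <- table(pred, obs)
sens <- tab["1", "1"] / sum(tab[, "1"])   # sensitivity = 0
spec <- tab["0", "0"] / sum(tab[, "0"])   # specificity = 1
(sens + spec) / 2                         # balanced accuracy = 0.5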

table(trainy)
trainy
    0     1 
10716  1181

table(testy)
testy
    0     1 
10646  1250 

library(glmnet)

cvfit <- glmnet(x = trainx, y = trainy, alpha = 1, family = "binomial")
cv.glmmod <- cv.glmnet(x = trainx, y = trainy, alpha = 1, nfolds = 10,
                       family = "binomial")

p <- predict(cvfit, newx = testx, type = "class", s = cv.glmmod$lambda.min)
pred_class <- as.numeric(p)
all(pred_class == 0)
[1] TRUE
auc(pred_class, testy)
Error in roc.default(response, predictor, auc = TRUE, ...) : 
  No case observation.

Best Answer

ROC curves assume that the predictions are continuous values (such as predicted probabilities). You're using hard class predictions of the form $\hat{y}=1$ or $\hat{y}=0$, which is nonsensical for ROC analysis, because ROC analysis is about assessing the trade-off between true positives and false positives as the alarm threshold varies.
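For example, reusing the objects from the question, you can ask glmnet for predicted probabilities instead of class labels and pass those to pROC (a minimal sketch; note that pROC::auc expects the observed response as its first argument and the predictor as its second):

library(pROC)

# Continuous scores (predicted probabilities) rather than hard class labels
p_prob <- predict(cvfit, newx = testx, type = "response",
                  s = cv.glmmod$lambda.min)

# auc(response, predictor): observed labels first, scores second
auc(testy, as.numeric(p_prob))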

This question is a very good introduction to the topic.
