ROC Curve – How to Build with Only Labels and Predictions

anomaly detection, data visualization, python, roc, time series

I built different models to approach the anomaly detection problem and I'd like to plot a ROC curve to see how they perform on my datasets. Both models are unsupervised neural networks, and they output a predicted label for each point (0 if it is not an anomaly, 1 if it is). I have the correct labels, so I can manually calculate metrics such as precision, recall, etc.

The problem is that when I pass these predictions to the dedicated ROC-plotting method, it understandably doesn't work, since it expects a probability or score. How can I solve this problem? Are there valid alternatives to the ROC curve? I'll leave my (wrong) code so you can check what I tried to do:

import matplotlib.pyplot as plt
import sklearn.metrics as metrics

y_test = [1, 1, 1, 1, 1, 1, 1, 1]
preds = [1, 1, 1, 1, 1, 1, 1, 1]
fpr, tpr, threshold = metrics.roc_curve(y_test, preds)
roc_auc = metrics.auc(fpr, tpr)
plt.title('Receiver Operating Characteristic')
plt.plot(fpr, tpr, 'b', label = 'AUC = %0.2f' % roc_auc)
plt.legend(loc = 'lower right')
plt.plot([0, 1], [0, 1],'r--')
plt.xlim([0, 1])
plt.ylim([0, 1])
plt.ylabel('True Positive Rate')
plt.xlabel('False Positive Rate')
plt.show()

When I run it I get an empty plot and the following warning:

/opt/anaconda3/lib/python3.8/site-packages/sklearn/metrics/_ranking.py:941: UndefinedMetricWarning: No negative samples in y_true, false positive value should be meaningless
  warnings.warn("No negative samples in y_true, "

Best Answer

Building the ROC curve with sklearn.metrics requires providing the score, not the final classification: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html The decision threshold on the score is then varied, and the resulting FPR and TPR are reported as a parametric function of that threshold.
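As a minimal sketch of the difference: with hard 0/1 predictions, `roc_curve` only has one effective threshold, so the "curve" degenerates to a single point; with a continuous anomaly score (for an autoencoder-type model, the reconstruction error is a natural choice) it can sweep many thresholds. The labels and score values below are illustrative, not from your models. Also note that `y_test` must contain both classes, which is why your all-ones example triggered the warning.

```python
import sklearn.metrics as metrics

# Ground truth must contain both classes (0 = normal, 1 = anomaly).
y_test = [0, 0, 0, 1, 0, 1, 0, 1]

# Hard 0/1 predictions: only one effective threshold, so the "curve"
# collapses to a single point between (0, 0) and (1, 1).
hard_preds = [0, 1, 0, 1, 0, 1, 0, 0]
fpr_hard, tpr_hard, thr_hard = metrics.roc_curve(y_test, hard_preds)
print(len(thr_hard))  # only a handful of thresholds

# Continuous anomaly scores (e.g., reconstruction error; values are
# illustrative) let roc_curve sweep a threshold over the scores and
# trace an actual curve.
scores = [0.10, 0.20, 0.15, 0.85, 0.30, 0.90, 0.05, 0.70]
fpr, tpr, thresholds = metrics.roc_curve(y_test, scores)
roc_auc = metrics.auc(fpr, tpr)
print('AUC = %0.3f' % roc_auc)
```

Your plotting code then works unchanged once `fpr` and `tpr` come from scores. If your networks truly can only emit hard labels, a ROC curve is not meaningful; report threshold-free point metrics instead (precision, recall, F1, or a confusion matrix).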
