Solved – Will the ROC curve for a model always be symmetric if we have enough training data?

auc, model-selection, predictive-models, roc

An ROC curve usually looks like the following figure:

[Figure: ROC curve]

If we have enough data, can we safely assume that the ROC curve for a model will always be symmetric around the line y = 100 – x? If not, is there any way to shift the ROC curve up (the red line)?

I am asking because my final goal is to optimize sensitivity at a minimum specificity. However, using sensitivity as my evaluation metric seems to be very similar to using AUC. So I am wondering whether AUC is highly correlated with sensitivity as an evaluation metric, given that you cannot intentionally tweak the shape of the ROC curve.
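To make the metric in question concrete, here is a minimal sketch of "sensitivity at a minimum specificity" in plain Python. The function name `sensitivity_at_min_specificity` and the toy scores and labels are illustrative assumptions, not taken from any library or from my actual data.

```python
def sensitivity_at_min_specificity(scores, labels, min_specificity):
    """Best sensitivity over all cutoffs whose specificity is >= min_specificity.

    scores: higher means "more likely positive"; labels: 1 = positive, 0 = negative.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = 0.0
    # Try every distinct score as a cutoff (predict positive when score >= cutoff).
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        specificity = 1 - fp / n_neg
        if specificity >= min_specificity:
            best = max(best, tp / n_pos)
    return best


# Hypothetical toy data, for illustration only.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.35, 0.2, 0.1]

for min_spec in (1.0, 0.75, 0.5):
    print(min_spec, sensitivity_at_min_specificity(toy_scores, toy_labels, min_spec))
```

This reads a single point off the ROC curve, whereas AUC summarizes the whole curve, which is why the shape of the curve matters for my question.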

I also tried using sample weights, but this seems to affect only the cutoff of the classification model. Please see my attempts with an SVM, in which I changed the weight of the positive class.

Best Answer

No, the ROC curve does not need to be symmetric.

Take as an example a dataset with a binary dependent variable $y$ and a single binary independent variable $x$. Assume that $y=1$ whenever $x=1$ and that $y$ is randomly selected when $x=0$, taking value 1 with probability 0.5 and taking value 0 with probability 0.5. Assume $x=1$ for 1/3 of the observations and $x=0$ for 2/3 of the observations.

No matter how much data you have, any sensible model will have an ROC curve like the following:

[Figure: ROC curve for the example dataset]

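Basically, you can identify half of the positives with no errors using the observations with $x=1$, and the rest of the selection is more or less random. The curve therefore rises vertically from $(0, 0)$ to a true positive rate of 0.5 at a false positive rate of 0, and then runs in a straight line to $(1, 1)$. The ROC curve from your model will not be symmetric, regardless of the size of the training set.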
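The example can be checked with a small simulation. The following is a sketch in plain Python, assuming the model scores each observation with $P(y=1 \mid x)$, that is 1.0 when $x=1$ and 0.5 when $x=0$; the helper names `simulate`, `roc_points`, and `tpr_at`, the sample size, and the seed are all illustrative choices.

```python
import random


def simulate(n, seed=0):
    """Draw n observations: x=1 with probability 1/3; y=1 if x=1, else a fair coin."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = 1 if rng.random() < 1 / 3 else 0
        y = 1 if x == 1 else int(rng.random() < 0.5)
        xs.append(x)
        ys.append(y)
    return xs, ys


def roc_points(scores, labels):
    """Return ROC points (FPR, TPR), starting at (0, 0), one per distinct score."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    # Sweep cutoffs from the highest score down; tied scores enter together.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def tpr_at(points, fpr):
    """Linearly interpolate the curve's TPR at a given FPR."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x1 > x0 and x0 <= fpr <= x1:
            return y0 + (y1 - y0) * (fpr - x0) / (x1 - x0)
    raise ValueError("fpr outside the curve")


xs, ys = simulate(30000)
# Assumed "sensible model": score = P(y=1 | x).
scores = [1.0 if x == 1 else 0.5 for x in xs]
points = roc_points(scores, ys)
print(points)  # three points: (0, 0), a knee near (0, 0.5), and (1, 1)

# Symmetry around y = 1 - x would map the knee (a, b) to (1 - b, 1 - a).
knee = points[1]
mirrored = (1 - knee[1], 1 - knee[0])
print("mirrored knee:", mirrored)
print("curve at that FPR:", tpr_at(points, mirrored[0]))
```

If the curve were symmetric, the mirrored knee would also lie on it. Here the mirrored knee sits at a true positive rate of 1, while the curve at the same false positive rate is only about 0.75, and increasing `n` does not close that gap.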