Can one estimate and calculate a confidence interval for the value of a cutpoint obtained from the ROC curve?
For example, using the pROC package in R:
> library(pROC)
> data(aSAH)
> roc1 <- roc(aSAH$outcome, aSAH$s100b, percent=TRUE,
              # arguments for ci
              ci=TRUE, boot.n=100, ci.alpha=0.9, stratified=FALSE,
              # arguments for plot
              plot=TRUE, auc.polygon=TRUE, max.auc.polygon=TRUE, grid=TRUE,
              print.auc=TRUE, show.thres=TRUE)
Requesting the confidence intervals with:
> ci.thresholds(roc1)
produces:
95% CI (2000 stratified bootstrap replicates):
thresholds sp.low sp.median sp.high se.low se.median se.high
-Inf 0.000 0.00 0.00 100.00 100.00 100.00
0.065 6.944 13.89 22.22 92.68 97.56 100.00
0.075 12.500 22.22 31.94 80.49 90.24 97.56
0.085 20.830 30.56 41.67 77.99 87.80 97.56
0.095 27.780 38.89 50.00 70.73 82.93 92.68
0.105 37.500 48.61 59.72 65.85 78.05 90.24
0.115 43.060 54.17 65.28 60.98 75.61 87.80
0.135 47.220 58.33 69.44 53.66 68.29 80.49
0.155 58.330 69.44 80.56 51.22 65.85 80.49
0.205 70.830 80.56 88.89 48.78 63.41 78.05
0.245 73.580 81.94 90.28 43.90 58.54 73.17
0.290 73.610 83.33 91.67 34.15 51.22 65.85
0.325 76.350 84.72 93.06 29.27 46.34 60.98
0.345 79.170 87.50 94.44 29.27 43.90 58.54
0.395 80.560 88.89 95.83 26.83 41.46 56.10
0.435 83.330 90.28 95.87 24.39 39.02 53.66
0.475 90.280 95.83 100.00 19.51 34.15 48.78
0.485 93.060 97.22 100.00 17.07 31.71 46.34
0.510 100.000 100.00 100.00 14.63 29.27 43.90
QUESTION
Why is there no CI on the thresholds themselves?
UPDATE
I realised how to specify that the best cutpoint should use topleft rather than youden:
rocobj <- plot.roc(aSAH$outcome, aSAH$s100b,
                   main="Confidence intervals",
                   percent=TRUE, ci=TRUE,
                   print.auc=TRUE)  # print the AUC (will contain the CI)
ciobj <- ci.se(rocobj, specificities=seq(0, 100, 5))
plot(ciobj, type="shape", col="#1c61b6AA")
plot(ci(rocobj, of="thresholds", thresholds="best", best.method="topleft"))
Best Answer
You fixed the thresholds, so they cannot vary in the bootstrap.
Let's simplify and look at only one threshold:
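A minimal sketch of such a call, assuming the aSAH data from the question (the ROC curve is rebuilt without the plotting arguments, and boot.n and the seed are arbitrary choices made only to keep the example quick and reproducible):

```r
library(pROC)
data(aSAH)

# Rebuild the ROC curve from the question, without plotting.
# levels and direction are set explicitly so nothing is guessed from the data.
roc1 <- roc(aSAH$outcome, aSAH$s100b, percent = TRUE,
            levels = c("Good", "Poor"), direction = "<")

# CI of specificity and sensitivity at one fixed threshold.
# The threshold is an input here, so it gets no interval of its own.
set.seed(42)  # arbitrary seed, for reproducibility only
ci_fixed <- ci.thresholds(roc1, thresholds = 0.205, boot.n = 500)
print(ci_fixed)
```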
By doing that, you fixed the threshold at 0.205 and asked: how much can my sensitivity and specificity vary at that threshold? The threshold is your fixed point. It has no uncertainty associated with it. You implicitly did the same by asking for all thresholds, even though you didn't spell it out.
If you want a CI on the threshold, you have to reformulate the question, for instance: how uncertain is the "best" threshold?
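One way to make that question concrete is to re-determine the "best" threshold inside every bootstrap replicate instead of fixing it beforehand. The following is a minimal sketch, not pROC's own implementation: best_threshold is a hypothetical helper, and the stratified resampling, the 500 replicates, the seed and the percentile interval are all assumptions made for illustration.

```r
library(pROC)
data(aSAH)

# Hypothetical helper: "best" threshold of the ROC curve for one sample.
best_threshold <- function(response, predictor, method = "youden") {
  r <- roc(response, predictor, levels = c("Good", "Poor"), direction = "<")
  thr <- coords(r, "best", ret = "threshold", best.method = method)
  # coords() can return several thresholds in case of ties; keep the first.
  as.numeric(unlist(thr))[1]
}

# Stratified bootstrap: resample within each outcome group separately.
idx_good <- which(aSAH$outcome == "Good")
idx_poor <- which(aSAH$outcome == "Poor")

set.seed(42)   # arbitrary seed, for reproducibility only
n_boot <- 500  # assumed number of replicates
boot_thr <- replicate(n_boot, {
  idx <- c(sample(idx_good, replace = TRUE),
           sample(idx_poor, replace = TRUE))
  best_threshold(aSAH$outcome[idx], aSAH$s100b[idx])
})

# Percentile interval and median of the re-selected best threshold.
ci_best <- quantile(boot_thr, c(0.025, 0.5, 0.975))
print(ci_best)
```

Passing method = "topleft" to the helper switches the criterion. Because empirical thresholds lie between observed values, the bootstrap distribution is discrete and the interval can be wide. pROC's ci.coords with x="best" is documented to re-select the best threshold in each replicate in a similar way; check ?ci.coords in your installed version before relying on it.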
Now of course you could ask: let's fix the sensitivity at X = 0.9 and see how the threshold varies there. But this is trickier than it sounds. Most of the ROC curve is actually a line between discrete points, so most points on the curve fall between two thresholds. To calculate a threshold for an arbitrary point, you would need to interpolate a threshold value there. This is doable, but it requires some parametric assumptions about the distribution of thresholds. Are you doing a linear interpolation? That would be pretty bad for most datasets I've ever seen.
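To see the problem on the aSAH data, the sketch below lists the two observed operating points that bracket a target sensitivity. The target of 90 (in percent, matching percent=TRUE) and the variable names are assumptions for illustration; no interpolation is attempted, which is exactly the step that needs extra assumptions.

```r
library(pROC)
data(aSAH)

roc1 <- roc(aSAH$outcome, aSAH$s100b, percent = TRUE,
            levels = c("Good", "Poor"), direction = "<")

# All observed operating points of the empirical ROC curve.
pts <- coords(roc1, "all", ret = c("threshold", "sensitivity"),
              transpose = FALSE)

target <- 90  # hypothetical target sensitivity, in percent

# Sensitivity can only decrease as the threshold increases, so the target is
# bracketed by the highest threshold still at or above it and the lowest
# threshold already below it.
above <- pts[pts$sensitivity >= target, ]
below <- pts[pts$sensitivity <  target, ]
bracket <- rbind(above[which.max(above$threshold), ],
                 below[which.min(below$threshold), ])
print(bracket)
```

Unless an observed point hits the target exactly, any threshold reported for it has to be invented somewhere between these two rows.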
So far, pROC is not able to reliably calculate a threshold at an arbitrary coordinate, so questions like "how uncertain is the threshold at a fixed sensitivity X?" result in errors or missing values. It's a can of worms I tried to open, but I never got anything useful out of it.