Solved – In R, how to compute the p-value for the area under the ROC curve

p-value, r, roc

I am struggling to find a way to compute the p-value for the area under a receiver operating characteristic (ROC) curve. I have a continuous variable and a diagnostic test result, and I want to see whether the AUROC is statistically significant.

I found many packages dealing with ROC curves: pROC, ROCR, caTools, verification, Epi. But even after many hours spent reading the documentation and testing, I couldn't find out how to do it. I think I've just missed it.

Best Answer

In your situation it would be fine to plot a ROC curve, and to calculate the area under that curve, but this should be thought of as supplemental to your main analysis, rather than the primary analysis itself. Instead, you want to fit a logistic regression model.

The logistic regression model comes with a test of the model as a whole. (Since you have only one variable, that test addresses the same hypothesis as the test of your test result variable, so the two p-values will be essentially the same; a likelihood-ratio test and a Wald test can differ slightly in finite samples.) That p-value is the one you are after. The model also lets you calculate the predicted probability that an observation is diseased. A receiver operating characteristic (ROC) curve tells you how sensitivity and specificity trade off as you use different thresholds to convert the predicted probability into a predicted classification. Since the predicted probability is a monotone function of your test result variable, the curve also tells you how they trade off if you use different test result values as your threshold.
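The approach described above can be sketched in base R roughly as follows. This is a minimal illustration on simulated data, not a definitive analysis: the variable names (`marker`, `disease`), the sample size, and the simulation parameters are all assumptions made for the example, and the AUC is computed with the rank (Mann-Whitney) formula rather than with any of the ROC packages mentioned in the question.

```r
# Simulated data for illustration only (hypothetical names and values).
set.seed(1)
n <- 200
marker  <- rnorm(n)                                    # continuous variable
disease <- rbinom(n, 1, plogis(-0.5 + 1.2 * marker))   # binary diagnostic result

# Logistic regression of disease status on the continuous variable.
fit <- glm(disease ~ marker, family = binomial)

# Wald p-value for the marker coefficient.
p_wald <- summary(fit)$coefficients["marker", "Pr(>|z|)"]

# Likelihood-ratio test of the whole model against the intercept-only model.
lr_stat <- fit$null.deviance - fit$deviance
lr_df   <- fit$df.null - fit$df.residual
p_lrt   <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# Predicted probability of disease for each observation.
p_hat <- fitted(fit)

# Area under the ROC curve via the rank (Mann-Whitney) formulation.
n1  <- sum(disease == 1)
n0  <- sum(disease == 0)
auc <- (sum(rank(p_hat)[disease == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

# Sensitivity and specificity when classifying at a given probability threshold.
sens_spec <- function(threshold) {
  pred <- as.integer(p_hat >= threshold)
  c(sensitivity = sum(pred == 1 & disease == 1) / n1,
    specificity = sum(pred == 0 & disease == 0) / n0)
}

print(c(p_wald = p_wald, p_lrt = p_lrt, auc = auc))
print(sens_spec(0.5))
```

Calling `sens_spec()` over a grid of thresholds traces out the points of the ROC curve; here `p_wald` and `p_lrt` are the two versions of the p-value discussed above, and `auc` is the supplemental summary.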


If you are not terribly familiar with logistic regression, there are some resources available on the internet (besides the Wikipedia page linked above):