Solved – Sample size calculation for ROC/AUC analysis

auc, roc, sample-size, statistical-power

As background, I am not familiar with statistics beyond a basic level. I have been tasked with doing some analysis that is outside my comfort zone.

I am trying to figure out how to compute necessary sample sizes for an ROC analysis based on desired statistical power. I have read Hanley and McNeil's paper (http://www.med.mcgill.ca/epidemiology/hanley/software/hanley_mcneil_radiology_82.pdf) and understand it fairly well. The relevant portion is the (over)estimate of standard error:
$$ SE(W) = \sqrt{\frac{A(1-A) + (n_a - 1)(Q_1 - A^2) + (n_n - 1)(Q_2 - A^2)}{n_a n_n}}$$
where $A$ is the expected AUC, $n_a$ is the number of abnormal samples, $n_n$ is the number of normal samples, and $Q_1 = \frac{A}{2-A}$, $Q_2 = \frac{2A^2}{1+A}$.
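For reference, a minimal Python sketch of this standard error formula. The function and argument names (`hanley_mcneil_se`, `n_abnormal`, `n_normal`) are my own and do not come from the paper; the arithmetic follows the expression above term by term.

```python
import math


def hanley_mcneil_se(auc, n_abnormal, n_normal):
    """Standard error of the AUC using the Hanley-McNeil approximation."""
    if not 0.0 < auc < 1.0:
        raise ValueError("auc must lie strictly between 0 and 1")
    if n_abnormal < 1 or n_normal < 1:
        raise ValueError("both groups need at least one sample")

    q1 = auc / (2.0 - auc)             # Q_1 = A / (2 - A)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)  # Q_2 = 2A^2 / (1 + A)

    numerator = (
        auc * (1.0 - auc)
        + (n_abnormal - 1) * (q1 - auc ** 2)
        + (n_normal - 1) * (q2 - auc ** 2)
    )
    return math.sqrt(numerator / (n_abnormal * n_normal))


# Example: expected AUC of 0.8 with 50 abnormal and 50 normal samples.
se = hanley_mcneil_se(0.8, 50, 50)
print(f"SE = {se:.4f}")
```

With an expected AUC of 0.8 and 50 samples in each group, this gives a standard error of about 0.045.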

Using this, I can estimate confidence intervals fairly easily ($A\pm 1.96\cdot SE$ for $95\%$ confidence level). The part I am unsure about is how the extra parameter of statistical power comes into play. Let's say, for example, that I want a statistical power of $80\%$ when I have an expected $A$ value of 0.8. How would I go about finding suitable $n_a,n_n$ pairs?

Best Answer

I found the answer in the 2004 paper "ROC Curves in Clinical Chemistry: Uses, Misuses, and Possible Solutions" by Nancy A. Obuchowski, Michael L. Lieber, and Frank H. Wians, Jr. (page 1123). The paper can be found here: http://www.clinchem.org/content/50/7/1118.full.pdf+html

The formula gives the number of diseased patients needed for a one-sided test of whether the AUC is $>0.5$.

$$ n_D = \frac{\left(z_\alpha \sqrt{0.0792 \times (1+1/\kappa)} + z_\beta \sqrt{V(\hat \theta)}\right)^2}{(\theta - 0.5)^2} $$

where $V(\hat \theta)$ is the variance function of $\hat \theta$, given by:

$$ V(\hat \theta) = (0.0099 \times e^{-A^2/2}) \times (5A^2 + 8 + (A^2 + 8)/\kappa), $$

i.e., the variance of $\hat \theta$ is equal to $V(\hat \theta)/n_D$. Here $ A = \Phi^{-1}(\theta) \times 1.414 $; $\Phi^{-1}$ is the inverse of the cumulative normal distribution function; $\kappa$ is the ratio of the number of control $n_C$ to diseased $n_D$ patients in the study sample (i.e., $\kappa = n_C/n_D$); $\theta$ is the conjectured area under the ROC curve (under the alternative hypothesis); $z_\alpha$ is the upper $\alpha$th percentile of the standard normal distribution, where $\alpha$ is the type I error rate (usually $\alpha = 0.05$); and $z_\beta$ is the upper $\beta$th percentile of the standard normal distribution, where $\beta$ is the type II error rate (often $\beta = 0.10$ or $0.20$).
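As a rough illustration, the formula can be sketched in Python using only the standard library. The function names (`variance_function`, `diseased_sample_size`) are my own and not from the paper. Note that the variable `a` in the code is the paper's $A = \Phi^{-1}(\theta) \times 1.414$, not the AUC written as $A$ in the question.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for Phi^{-1} and z-values


def variance_function(theta, kappa):
    """V(theta_hat) from the formula above, for conjectured AUC theta."""
    a = _STD_NORMAL.inv_cdf(theta) * 1.414
    return (0.0099 * math.exp(-a ** 2 / 2)) * (
        5 * a ** 2 + 8 + (a ** 2 + 8) / kappa
    )


def diseased_sample_size(theta, kappa=1.0, alpha=0.05, beta=0.20):
    """Unrounded n_D for a one-sided test of AUC > 0.5.

    theta: conjectured AUC under the alternative hypothesis
    kappa: ratio of controls to diseased patients (n_C / n_D)
    alpha: type I error rate; beta: type II error rate (power = 1 - beta)
    """
    if not 0.5 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0.5 and 1")
    if kappa <= 0:
        raise ValueError("kappa must be positive")

    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)  # upper alpha-th percentile
    z_beta = _STD_NORMAL.inv_cdf(1 - beta)    # upper beta-th percentile

    null_term = z_alpha * math.sqrt(0.0792 * (1 + 1 / kappa))
    alt_term = z_beta * math.sqrt(variance_function(theta, kappa))
    return (null_term + alt_term) ** 2 / (theta - 0.5) ** 2


# Example from the question: 80% power (beta = 0.20), expected AUC of 0.8.
kappa = 1.0
n_d = diseased_sample_size(0.8, kappa=kappa, alpha=0.05, beta=0.20)
n_diseased = math.ceil(n_d)
n_control = math.ceil(kappa * n_d)
print(n_diseased, n_control)
```

For the setting in the question ($\theta = 0.8$, 80% power, $\alpha = 0.05$, equal group sizes), this comes out at about 10 diseased patients and 10 controls. Changing `kappa` gives other $(n_D, n_C)$ pairs for the same power.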