Survival Analysis – Calculating Type II Error in Survival Analysis

logrank-test survival type-i-and-ii-errors

I am a beginner and have been looking all over the internet and in books for an answer to the following question, without success:

I have observational data and would like to compare the survival outcomes of two groups using Kaplan-Meier curves and the log-rank test. How do you calculate the probability of a type II error for the comparison of survival outcomes between the two groups?

I hope someone can help me out.

Best wishes,
Taka

Best Answer

Firstly, with observational data, you should not normally just compare groups using a KM curve and/or log-rank test. Data from non-randomized comparisons typically need other adjustments (see techniques like propensity scores, regression adjustment, matching cases to controls, or other methods suited to the particular study).

Secondly, the type II error (i.e. the error of wrongly not rejecting a null hypothesis) works much the same way for the log-rank test as for, say, the t-test. You need three things: the significance level you use (let's assume a two-sided significance level of $\alpha=0.05$), the log-hazard ratio $\delta<0$ between the two groups that you assume to be true and for which you want the type II error, and the expected number of patients with an event in each arm (let's call these $d_1$ and $d_2$, with $d=d_1+d_2$). The number of events is what replaces the number of patients in each arm and the common standard deviation used for normally distributed data.

In the simple case of equally sized groups, the power is simply the probability that a $N(\delta \sqrt{d/4}, 1)$ random variable falls below $\Phi^{-1}(\alpha/2)$ ($\approx -1.96$ for $\alpha=0.05$). E.g. with R you can evaluate that as pnorm(qnorm(alpha/2), delta * sqrt(d/4), 1). This ignores the negligible chance of rejecting in the wrong direction. The type II error is simply 1 - power. Most software, including R, also has functions that implement this, including the more general case with unequal group sizes. The trick is to Google for "power log-rank test" rather than "type II error".
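To make the arithmetic concrete, here is a minimal Python sketch of the same normal approximation for equally sized groups, using only the standard library. The function names `logrank_power` and `type_ii_error` are made up for illustration, and the hazard ratio and event count in the usage note are arbitrary example inputs, not values from the question.

```python
from math import log, sqrt
from statistics import NormalDist


def logrank_power(hazard_ratio, total_events, alpha=0.05):
    """Approximate power of a two-sided log-rank test, equal group sizes.

    Uses the normal approximation from the answer: the test statistic is
    treated as N(delta * sqrt(d / 4), 1), where delta is the log hazard
    ratio and d is the total expected number of events.
    """
    # Work with a negative log hazard ratio so that "rejecting" means
    # falling below the lower critical value; the sign is symmetric.
    delta = -abs(log(hazard_ratio))
    mean = delta * sqrt(total_events / 4)
    # Lower critical value, about -1.96 for alpha = 0.05.
    critical = NormalDist().inv_cdf(alpha / 2)
    # Probability that N(mean, 1) falls below the critical value.
    # The tiny chance of rejecting in the wrong direction is ignored.
    return NormalDist(mean, 1).cdf(critical)


def type_ii_error(hazard_ratio, total_events, alpha=0.05):
    """Type II error is one minus the power."""
    return 1 - logrank_power(hazard_ratio, total_events, alpha)
```

For instance, `logrank_power(0.7, 200)` comes out at roughly 0.71, so the type II error for an assumed hazard ratio of 0.7 with 200 expected events is roughly 0.29.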

However, note that for an observational study these power numbers will usually be inappropriate, because adjustments (as discussed above) will typically be needed. There might be special formulae for some situations in the epidemiological literature, but unless someone has dealt with your particular situation before and published a formula for it, you may have to use simulations to obtain the power/type II error and other operating characteristics of your study.
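As a rough illustration of the simulation approach, the sketch below estimates power by repeatedly generating data and applying a hand-written log-rank test. It is only a skeleton under strong assumptions that are mine, not the answer's: exponential event times, censoring only at a fixed end of follow-up, equal group sizes, and no confounding. All function and parameter names are hypothetical. For a real observational study, the data-generating step would need to mimic the confounding you expect, and the test would be replaced by the adjusted analysis you actually plan to run.

```python
import random
from math import sqrt
from statistics import NormalDist


def logrank_z(times, events, groups):
    """Standardized log-rank statistic for group 1 versus group 0."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)        # subjects still at risk, both groups
    at_risk_1 = sum(groups)     # subjects still at risk in group 1
    obs_minus_exp, variance = 0.0, 0.0
    i = 0
    while i < len(order):
        t = times[order[i]]
        d = d1 = leaving = leaving_1 = 0
        # Collect everyone whose time equals t (handles ties).
        while i < len(order) and times[order[i]] == t:
            k = order[i]
            d += events[k]
            d1 += events[k] * groups[k]
            leaving += 1
            leaving_1 += groups[k]
            i += 1
        if d > 0 and at_risk > 1:
            frac_1 = at_risk_1 / at_risk
            obs_minus_exp += d1 - d * frac_1
            # Hypergeometric variance of the events in group 1.
            variance += d * frac_1 * (1 - frac_1) * (at_risk - d) / (at_risk - 1)
        at_risk -= leaving
        at_risk_1 -= leaving_1
    return obs_minus_exp / sqrt(variance) if variance > 0 else 0.0


def simulated_power(hazard_ratio, n_per_group, base_rate=0.1,
                    follow_up=10.0, alpha=0.05, n_sim=500, seed=1):
    """Share of simulated studies in which the log-rank test rejects."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sim):
        times, events, groups = [], [], []
        for group in (0, 1):
            # Assumed exponential hazards; group 1 hazard is scaled by the ratio.
            rate = base_rate * (hazard_ratio if group == 1 else 1.0)
            for _subject in range(n_per_group):
                t = rng.expovariate(rate)
                times.append(min(t, follow_up))            # censor at end of follow-up
                events.append(1 if t <= follow_up else 0)  # 1 = event observed
                groups.append(group)
        if abs(logrank_z(times, events, groups)) > critical:
            rejections += 1
    return rejections / n_sim
```

The estimated type II error is then `1 - simulated_power(...)` for the assumed hazard ratio. Setting `hazard_ratio=1.0` gives a quick sanity check, since the rejection rate should then sit near `alpha`.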
