Survival Analysis – Addressing No Failures in One Group

regression, survival

I have a question regarding survival analysis: imagine I have a large cohort in which most participants received one treatment (treatment A), and a much smaller subpopulation received another (treatment B).
We follow both groups for 5 years. In the group receiving treatment B, no one dies; in group A, 10% die.
There are obvious potential issues here around bias and sampling. But suppose the two groups are well matched, apart from one or two covariates. What method would be best for gaining insight into whether treatment B may actually be superior, or whether this is just a sampling effect or something else, particularly given the small sample size of group B? Thank you!

Best Answer

If you don't have to correct for covariates, then you can evaluate the difference with a standard log-rank test. See this answer.
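As a sketch of that unadjusted comparison, `survdiff()` in the survival package runs the log-rank test; the data below are simulated to mirror the question (a large arm A with roughly 10% deaths over 5 years and a small arm B with none), since the original cohort isn't available:

```r
library(survival)

## Simulated illustration (not the poster's data): large group A with some
## deaths before year 5, small group B with everyone censored at 5 years.
set.seed(42)
n_a <- 500; n_b <- 30
time_a <- pmin(rexp(n_a, rate = 0.02), 5)   # ~10% of A dies before year 5
d <- data.frame(
    time   = c(time_a, rep(5, n_b)),        # group B: all censored at 5 years
    status = c(as.integer(time_a < 5), rep(0, n_b)),  # 1 = death observed
    trt    = rep(c("A", "B"), c(n_a, n_b))
)

## Standard log-rank test; it is well defined even when one arm has 0 events
survdiff(Surv(time, status) ~ trt, data = d)
```

The test still behaves sensibly with zero events in one arm, although with only 30 people in group B its power is limited.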

If you are doing Cox modeling to control for covariates, the Wald test typically reported on the coefficient for the treatment will be useless. Therneau and Grambsch discuss this situation in Section 3.5, "Infinite Coefficients," of Modeling Survival Data: Extending the Cox Model, and use the following (contrived) example that is sure to lead to an infinite coefficient:

library(survival)
fit <- coxph(Surv(futime, fustat) ~ rx + fustat, data = ovarian)
## warnings not shown here
summary(fit)
# Call:
# coxph(formula = Surv(futime, fustat) ~ rx + fustat, data = ovarian)
# 
#   n= 26, number of events= 12 
# 
#              coef  exp(coef)   se(coef)      z Pr(>|z|)
# rx     -5.566e-01  5.731e-01  6.199e-01 -0.898    0.369
# fustat  2.258e+01  6.414e+09  1.387e+04  0.002    0.999
# 
#        exp(coef) exp(-coef) lower .95 upper .95
# rx     5.731e-01  1.745e+00      0.17     1.932
# fustat 6.414e+09  1.559e-10      0.00       Inf
# 
# Concordance= 0.897  (se = 0.037 )
# Likelihood ratio test= 30.8  on 2 df,   p=2e-07
# Wald test            = 0.81  on 2 df,   p=0.7
# Score (logrank) test = 29.09  on 2 df,   p=5e-07

They say:

We do not view this as a serious concern at all, other than an annoying numerical breakdown of the Wald approximation. One is merely forced to do the multiple fits necessary for a likelihood ratio or score test.

I think that the likelihood-ratio test for individual coefficients is implemented directly in SAS. With R, you can examine the log-likelihood as a function of treatment-coefficient values beta (the profile likelihood) as Therneau and Grambsch outline in Section 3.4.1:

...first fit the overall model using all covariates...then [fit] a sequence of Cox models. For each trial value of beta, an offset term is used to include beta * [your treatment indicator] in the model as a fixed covariate. This essentially fixes the coefficient at the chosen value, while allowing the other coefficients to be maximized.

Here's how to do this for their contrived example:

beta <- seq(-1, 23, length = 500)
llik <- double(500)
for (i in 1:500) {
    temp <- coxph(Surv(futime, fustat) ~ rx + offset(beta[i] * fustat),
                  data = ovarian)
    llik[i] <- temp$loglik[2]
}
## There were 50 or more warnings (use warnings() to see the first 50)
plot(beta, llik, type = "l",
     ylab = "Partial likelihood", bty = "n")
temp <- fit$loglik[2] - qchisq(0.95, 1)/2  ## for 95% CI
abline(h = temp, lty = 2)

[Figure: profile likelihood with no events]

For the intersection point:

beta[which.min(abs(llik - (fit$loglik[2] - qchisq(0.95, 1)/2)))]
## [1] 2.751503

So the 95% CI for the Cox regression coefficient would be 2.75 to infinity.

Whether you are using a log-rank test without covariate adjustment or a Cox model with it, repeat the modeling and analysis on multiple bootstrapped samples of your data to evaluate the robustness of the result.
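One minimal sketch of that bootstrap check, again on simulated data standing in for the real cohort: resample rows with replacement and record the log-rank statistic each time, to see how stable the comparison is across resamples.

```r
library(survival)

## Simulated stand-in for the cohort (assumption: the real data would have
## columns time, status, trt): large arm A with events, small arm B with none.
set.seed(1)
n_a <- 500; n_b <- 30
time_a <- pmin(rexp(n_a, rate = 0.02), 5)
d <- data.frame(
    time   = c(time_a, rep(5, n_b)),
    status = c(as.integer(time_a < 5), rep(0, n_b)),  # group B: no events
    trt    = rep(c("A", "B"), c(n_a, n_b))
)

## Bootstrap the unadjusted log-rank comparison
n_boot <- 200
chisq_boot <- rep(NA_real_, n_boot)
for (b in 1:n_boot) {
    db <- d[sample(nrow(d), replace = TRUE), ]
    if (length(unique(db$trt)) < 2) next  # skip degenerate resamples
    chisq_boot[b] <- survdiff(Surv(time, status) ~ trt, data = db)$chisq
}
summary(chisq_boot)  # spread of the log-rank statistic across resamples
```

If the statistic (or the Cox coefficient, in the adjusted case) swings wildly across resamples, that is a sign the apparent benefit of treatment B rests on a handful of observations.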
