Dealing with no events in one treatment group – survival analysis

r, survival

I am currently trying to apply survival analysis to several tree species which were monitored for growth and phenology for 4 years and separated into three treatment groups. From these data I have created a survival variable that records whether each individual died during these 4 years or not.
For each species I have thus built data suitable for survival analysis, using the Surv and survfit functions from the "survival" package to create an R survival object and plot it for the three treatments.

My question is about how to deal with a treatment group in which no events occur. For some species I have a very small number of events (i.e. deaths), leaving me with a high number of non-events (for example, for 360 individuals, only 55 events were recorded across treatments, with one treatment having no events at all).

I have already searched for how to handle this, and mainly found that it is acceptable: the likelihood ratio test is still valid (while the Wald test is not). However, the summary of the coxph fit then reports extremely large hazard ratios (exp(coef)), for example 1.012e+09 with a p-value of 0.996, even though the plot shows an obvious difference between treatments.

I was wondering if anyone could help me resolve this problem:

  • is it ok to have such high estimates of the hazard ratio (exp(coef))?
  • does it really reflect the observed difference between treatments, or is the p-value badly overestimated?

Any help with how you would deal with this, or how you have dealt with it in the past, would be gladly accepted.

Best Answer

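You've come across an issue that can occur with Cox-PH models (actually, just about all survival regression models). That is, if no events occur in one group, then the estimated effect (log hazard ratio) of that group will be $-\infty$. If the group with no events happens to be the reference level, the coefficients of the other groups diverge to $+\infty$ instead, which is where hazard ratios like 1.012e+09 come from. In either case the standard error blows up too, so the Wald p-value is meaningless. This is very similar to the issue in generalized linear models with, say, the binomial family, when you have one group with all 0's or all 1's.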
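To make the degenerate fit concrete, here is a minimal sketch on simulated data. The data frame `d`, its column names, and the event rates are assumptions made up for illustration (they are not the asker's data); treatment "C" is forced to have no deaths.

```r
library(survival)

set.seed(1)
n <- 120  # hypothetical number of individuals per treatment

# Simulated death times for three treatments (rates are arbitrary assumptions)
d <- data.frame(
  treatment = factor(rep(c("A", "B", "C"), each = n)),
  time      = rexp(3 * n, rate = rep(c(0.05, 0.08, 0.05), each = n))
)
d$status <- as.integer(d$time < 4)  # 1 = died within the 4-year study
d$time   <- pmin(d$time, 4)         # everyone still alive is censored at 4 years

# Force treatment C to have no events at all
d$status[d$treatment == "C"] <- 0
d$time[d$treatment == "C"]   <- 4

# coxph typically warns here that a coefficient may be infinite
fit <- coxph(Surv(time, status) ~ treatment, data = d)

coef(fit)["treatmentC"]                              # large negative log hazard ratio
summary(fit)$coefficients["treatmentC", "Pr(>|z|)"]  # Wald p-value, uninformative here
summary(fit)$logtest                                 # likelihood ratio test for the model
```

With "A" as the reference level the coefficient for "C" runs off towards $-\infty$; calling `relevel(d$treatment, ref = "C")` before fitting should instead give very large positive coefficients for the other groups, as in the question.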

If you are just interested in comparing groups (without adjusting for other covariates), this can still be done with log-rank statistics: see the function survdiff in R's survival package.
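A minimal sketch of such a log-rank comparison, again on simulated data (the data frame `d` and its columns are hypothetical, with treatment "C" having no deaths):

```r
library(survival)

set.seed(1)
n <- 120  # hypothetical number of individuals per treatment

# Simulated data: three treatments, follow-up censored at 4 years
d <- data.frame(
  treatment = factor(rep(c("A", "B", "C"), each = n)),
  time      = rexp(3 * n, rate = rep(c(0.05, 0.08, 0.05), each = n))
)
d$status <- as.integer(d$time < 4)
d$time   <- pmin(d$time, 4)
d$status[d$treatment == "C"] <- 0   # no events in treatment C
d$time[d$treatment == "C"]   <- 4

# Log-rank test across all three treatments
lr <- survdiff(Surv(time, status) ~ treatment, data = d)
lr

# p-value computed by hand from the chi-square statistic
# (degrees of freedom = number of groups - 1)
p_value <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)
p_value
```

The log-rank statistic only compares observed and expected event counts per group, so a group with zero observed events does not cause any estimation problem.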

My approach would be as follows: fit the Cox-PH model on all the groups in which at least one event was observed. Then, for the group that had no events, use the log-rank statistic to compare it with some baseline group of interest. Note in the report that the log-rank statistic was used because the Cox-PH model resulted in a degenerate estimate for the group with no events.
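The two steps could be sketched as follows. This is only an illustration under assumptions: the data are simulated, and treatment "A" is arbitrarily taken as the baseline group of interest.

```r
library(survival)

set.seed(1)
n <- 120  # hypothetical number of individuals per treatment

# Simulated data: three treatments, follow-up censored at 4 years
d <- data.frame(
  treatment = factor(rep(c("A", "B", "C"), each = n)),
  time      = rexp(3 * n, rate = rep(c(0.05, 0.08, 0.05), each = n))
)
d$status <- as.integer(d$time < 4)
d$time   <- pmin(d$time, 4)
d$status[d$treatment == "C"] <- 0   # no events in treatment C
d$time[d$treatment == "C"]   <- 4

# Step 1: Cox model restricted to the groups with at least one event
events      <- tapply(d$status, d$treatment, sum)
with_events <- names(events)[events > 0]
d_cox <- droplevels(subset(d, treatment %in% with_events))
fit   <- coxph(Surv(time, status) ~ treatment, data = d_cox)
summary(fit)$coefficients   # finite hazard ratio for B relative to A

# Step 2: log-rank test of the zero-event group against baseline "A"
d_lr <- droplevels(subset(d, treatment %in% c("A", "C")))
lr   <- survdiff(Surv(time, status) ~ treatment, data = d_lr)
p_lr <- pchisq(lr$chisq, df = 1, lower.tail = FALSE)
p_lr
```

If the zero-event group is compared against several baselines this way, the resulting p-values are multiple comparisons and may need a correction such as `p.adjust`.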