Solved – coxph ran out of iterations and did not converge

cox-model, r, survival

Yes, I have checked that previous answers to "Ran out of iterations…" questions do not solve my problem.

I have fault data on Firefox, 899 faults and 1395 (estimated) censored faults. The censoring all happens on one of half a dozen start days and half a dozen end days (the initial/final release of a version).

library(survival)

ff_usage=read.csv("http://www.coding-guidelines.com/R_code/ff_usage.csv", as.is=TRUE)

f_sur=Surv(ff_usage$start, ff_usage$end, event=ff_usage$event)
plot(survfit(f_sur ~ 1))
f_cox=coxph(f_sur ~ total_usage+cluster(fault_id), data=ff_usage)

The Kaplan-Meier curve looks about right.

total_usage is an estimate of the number of Firefox users up until the fault is reported. It is strongly time dependent, so each fault timeline is broken up into 7-day intervals clustered on fault_id; the original, unsplit data is also available.
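For readers unfamiliar with this data layout, here is a minimal sketch of one way to cut follow-up time into fixed intervals using survSplit() from the survival package. The toy data frame and its values are hypothetical, and this is not necessarily how ff_usage.csv was built.

```r
library(survival)

# Hypothetical toy data: one row per fault, times in days.
toy <- data.frame(
  fault_id = 1:3,
  time     = c(10, 20, 5),
  event    = c(1, 0, 1)
)

# Cut each timeline at days 7 and 14; 'interval' numbers the pieces.
# survSplit() adds a 'tstart' column and keeps the event only on the
# last piece of each timeline.
toy_split <- survSplit(Surv(time, event) ~ ., data = toy,
                       cut = c(7, 14), episode = "interval")
print(toy_split)
```

A time-varying covariate can then be attached to each row, and the model fitted on the (tstart, time, event) triple with cluster(fault_id) so that rows from the same fault are treated as related.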

The dependency on total_usage (or its log) could be close to 1 (I am hoping for one or the other).

I have tried setting init and increasing iter.max; also strata(src_id) and subsetting on src_id.

Most of the start/end times are estimated and fall at regular intervals; I have tried adding some randomization, e.g., runif(n, -3, 3), with no change.

All I ever see is:

Warning message:
In fitter(X, Y, strats, offset, init, control, weights = weights,  :
  Ran out of iterations and did not converge

Suggestions for things to try welcome.

Best Answer

This may be a case where, as the coxph() documentation page puts it, "the actual MLE estimate of a coefficient is infinity" so that "the associated coefficient grows at a steady pace and a race condition will exist in the fitting routine." In particular, close interrelations of the start / end times with the total_usage variable may be the problem here.

When I have problems with a continuous predictor variable like your total_usage in survival analysis, I examine a split of the continuous variable at the median. Look at survival curves from your data based on a split of total_usage at its median value of 5866.2 (the coxph() fit for this simple analysis also didn't converge):

plot(survfit(f_sur~(total_usage > 5866.2),data=ff_usage))

Looks like almost all censoring times and events for the low total_usage cases are before something like time=700, while almost all events and censoring times for the high total_usage subset are greater than that time. Also, examining:

summary(survfit(f_sur~(total_usage > 5866.2),data=ff_usage))
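The separation in time between the two usage groups can also be checked numerically. The sketch below uses made-up toy data with the same column names as ff_usage (start, end, event, total_usage); the values are illustrative only, and the same lines could be run on the real data frame.

```r
# Toy counting-process data; every value here is made up for illustration.
toy <- data.frame(
  start       = c(0, 100, 200, 700, 800, 900),
  end         = c(100, 200, 300, 800, 900, 1000),
  event       = c(0, 1, 0, 1, 0, 1),
  total_usage = c(10, 20, 30, 7000, 8000, 9000)
)

# Dichotomize at the median, as in the survfit() calls above.
toy$high_usage <- toy$total_usage > median(toy$total_usage)

# Span of follow-up time and number of events within each usage group.
ranges <- do.call(rbind, lapply(split(toy, toy$high_usage), function(g) {
  data.frame(min_start = min(g$start),
             max_end   = max(g$end),
             events    = sum(g$event))
}))
print(ranges)
```

If the latest end time in the low-usage group falls before the earliest start time in the high-usage group, the two groups are never at risk together, so no risk set contains both and the data carry no information for comparing their hazards.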

may provide some insight. My data sets are typically much smaller than this, but I have run into related problems in Cox analysis with "a dichotomous variable where one of the groups has no events," so that hazard ratios are ill-defined.
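The "one group has no events" situation is easy to reproduce on simulated data. The sketch below is an illustration only: the sample size, rates, and variable names are assumptions and are unrelated to the Firefox data.

```r
library(survival)

set.seed(42)
n      <- 200
group  <- rep(c(0, 1), each = n / 2)
time   <- rexp(n, rate = 0.1)
status <- ifelse(group == 1, 0, 1)   # group 1 is entirely censored

print(table(group, status))

# Collect any warnings rather than letting them print, so they can be inspected.
msgs <- character(0)
fit <- withCallingHandlers(
  coxph(Surv(time, status) ~ group),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(msgs)        # any convergence warnings raised by coxph()
print(coef(fit))   # a large negative value: the estimate is drifting toward -Inf
```

The partial likelihood keeps increasing as the coefficient decreases, so the fitting routine stops only because of its iteration limit or tolerance, not because a finite maximum was found.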

Hope this helps point you in the right direction.

Related Solutions

Different Prediction Plots from Survival Coxph and RMS Cph in R – Comparison

I think there should definitely be a point where the confidence interval is zero width. You might also try a third way which is to use solely rms functions. There is an example under the help file for contrast.rms to get a hazard ratio plot. It starts with the comment # show separate estimates by treatment and sex. You'll need to anti-log to get the ratio.

Solved – R’s coxph won’t converge when I include factor (categorical) variables

coxph() takes an argument control which expects to be passed an object produced by coxph.control().

From ?coxph.control we see that there are two arguments related to the number of iterations: iter.max and outer.max. You can try to increase the iter.max value, so your call would be

coxph(Surv(time, status) ~ Chemo_Simple, data = dataset,
      control = coxph.control(iter.max = 50))

And then see if that converges. outer.max is not relevant here as your model doesn't contain any pspline terms.

Also, consider changing the starting values via the init argument to coxph(). You could use the coefficients from a lasso fit, for example, assuming they are on the same scale and for the same parameters as in the coxph() implementation.
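As a sketch of how init and iter.max are passed together, the example below uses the lung data set that ships with the survival package. The starting values are arbitrary numbers chosen for illustration, not the output of a lasso fit.

```r
library(survival)

# init needs one starting value per coefficient, in model-matrix order
# (here: age, then sex). These values are arbitrary illustrations.
fit <- coxph(Surv(time, status) ~ age + sex, data = lung,
             init    = c(0.02, -0.5),
             control = coxph.control(iter.max = 50))

print(fit$iter)    # number of iterations actually used
print(coef(fit))
```

If the number of iterations used equals iter.max, the fit stopped at the limit rather than converging, and raising the limit further is unlikely to help when the underlying estimate is infinite.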
