Solved – R’s coxph won’t converge when I include factor (categorical) variables

categorical data, convergence, cox-model, r

I have a dataset of 371 observations. When I run coxph with numeric variables it works fine. However, when I try to add factor (categorical) variables it returns “Ran out of iterations and the model did not converge”.

Of note, when I recode all factors as binary dummy variables and fit a lasso with glmnet, the model converges.

Here are examples of the code and output (including summary description of the variables):

> maxSTree.cox <- coxph(Surv(time, status) ~ Chemo_Simple, data = dataset)

Warning message:
In fitter(X, Y, strats, offset, init, control, weights = weights,  :
  Ran out of iterations and did not converge

> summary(dataset$Chemo_Simple)
         Anthra-HDAC       Anthra-Plus       ArsenicAtra              ATRA           ATRA-GO 
                0               163                 2                12                 0                 2 
         ATRA_IDA Demeth-HistoneDAC          Flu-HDAC     Flu-HDAC-plus         HDAC-Clof         HDAC-only 
                0                34                37                 4                24                 1 
        HDAC-Plus        LowArac+/-       LowDAC-Clof         MYLO+IL11    No Rx in MDACC            Phase1 
                4                 8                30                 5                 1                 5 
              SCT    StdARAC-Anthra      StdAraC-Plus          Targeted         VNP40101M 
                0                 0                 0                13                23 

Best Answer

coxph() takes a control argument, which expects an object produced by coxph.control().

From ?coxph.control we see that there are two arguments related to the number of iterations: iter.max and outer.max. You can try increasing iter.max (the default is 20), so your call would be

coxph(Surv(time, status) ~ Chemo_Simple, data = dataset,
      control = coxph.control(iter.max = 50))

Then see whether that converges. outer.max is not relevant here, as your model doesn't contain any pspline terms.
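As a minimal sketch of how to check this, the snippet below uses the lung data shipped with the survival package as a stand-in for your dataset. The data frame lung2 and the choice of covariates are assumptions made purely for illustration. It raises iter.max and then looks at fit$iter, the number of iterations the fit actually used.

```r
library(survival)

# Stand-in data: `lung` ships with survival; `sex` is recoded as a factor
# to mimic a categorical covariate (illustrative only, not your dataset).
lung2 <- transform(lung, sex = factor(sex, labels = c("male", "female")))

# Same pattern as above: raise the iteration cap via coxph.control().
fit <- coxph(Surv(time, status) ~ sex + age, data = lung2,
             control = coxph.control(iter.max = 50))

# Number of Newton-Raphson iterations actually used by the fit.
fit$iter
```

If fit$iter ends up at the cap and the warning comes back, more iterations alone are probably not enough, and the starting values are the next thing to look at.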

Also consider supplying starting values via the init argument to coxph(). You could, for example, use the coefficients from your lasso fit, assuming they are on the same scale and refer to the same parameters, in the same order, as in the coxph() model.
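A rough sketch of passing starting values, again on the lung data rather than your dataset: the vector start is a hypothetical stand-in for coefficients obtained elsewhere. It is taken here from a deliberately truncated coxph() run so that the example needs no extra packages. This is not the lasso route itself, only the mechanics of init.

```r
library(survival)

# Stand-in data (illustrative only, not your dataset).
lung2 <- transform(lung, sex = factor(sex, labels = c("male", "female")))

# Hypothetical source of starting values: a fit stopped after one iteration.
# In practice these could be coefficients from another model, e.g. a lasso fit.
rough <- suppressWarnings(
  coxph(Surv(time, status) ~ sex + age, data = lung2,
        control = coxph.control(iter.max = 1))
)
start <- coef(rough)

# `init` must have one value per coefficient, in model-matrix column order.
fit <- coxph(Surv(time, status) ~ sex + age, data = lung2,
             init = start,
             control = coxph.control(iter.max = 50))

coef(fit)
```

With a factor such as Chemo_Simple, init needs one entry per non-reference level, so it is worth comparing names(coef(fit)) against the names of whatever vector you pass in.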