Solved – KM versus Cox model

cox-modelkaplan-meiersurvival

This question is in general, but I would also like to know how the answer pertains to building the model in R.

I have some time to event data and want to do some survival analysis for it. For sake of example imagine that the response is some time to death value and that I have three groups A, B, and C. Now if I plot a KM curve for this data, given the fact that I have 3 groups, I will get three curves (one for each group).

No I am fitting the following model in R

model = coxph(Surv(time,censor) ~ group, data=death.data)

where group is a categorical variable with 3 levels pertaining to the three different groups: A, B, or C.

I am using the following command to plot the survival curves

plot(survfit(model,xlab ="Time",ylab="Survival"))

however, this only plots one curve. Is this what I should expect? Really I wanted three curves but am only getting one.

Does this imply that I need to build three survival models? One for each group, for example one model for group A versus other, one model for group B versus others, etc.? And if so, could I plot them all on the same graph and this would be equivalent to the KM curve representation?

Best Answer

What survfit() does if you do not specify a newdata argument is to try to give you a survival curve of an "average" individual. This is usually a bad idea because it is not obvious what you get out of it.

In your case you should specify this way: plot(survfit(model, newdata = data.frame("group" = c("group1", "group2"))) and in that case you will get two survival curves, one for an individual with group = "group1" and one for an individual with group = "group2".

For example I generated some data:

times <- rexp(30)
status <- rep(1, 30)
groups <- rep(c(1,2,3), 10)

dat <- data.frame(groups, times, status)
dat[dat$groups==1, "times"] <-  dat[dat$groups==1, "times"]+6

Then I fit the Cox model and i try to get the survival plots:

mod1<- coxph(Surv(times, status) ~ groups, dat)

plot(survfit(mod1)) # an "average" individual, in this case the same as someone from group 2
plot(survfit(mod1, newdata = data.frame("groups" = 1)))
plot(survfit(mod1, newdata = data.frame("groups" = 2)))
plot(survfit(mod1, newdata = data.frame("groups" = 3)))

# Plot all three survival curves in one plot
plot(survfit(mod1, newdata = data.frame("groups" = c(1,2,3))))
Related Question