Solved – How to calculate survival probability to year 1 using ‘survival’ package in R

kaplan-meier r survival

I am trying to calculate the probability of surviving to 1 year of age (i.e., 365 days) for a group of monkeys. The data set has three columns: animal ID, days to death, and censoring status (1 = individual died during the study; 0 = individual alive at the end of the study). Like this:

ID   days   status
mad  45     1
bad  135    1
kom  1564   0

How do I use a Kaplan-Meier estimate to calculate the probability that an individual will survive the first 365 days of its life?

When I use this code it does not even compute a median value:

survfit(Surv(comp.dat[,"days"], comp.dat[,"status"]) ~ 1 ) 
Call: survfit(formula = Surv(comp.dat[, "days"], comp.dat[, "status"]) ~ 
1)

  n  events  median 0.95LCL 0.95UCL 
 41      14      NA     293      NA 

Thanks!

Best Answer

I cannot tell exactly what the issue is with the R code you provide, so I created a self-contained example to illustrate how to find the 1-year (or any other time point) survival estimate and the median survival time (or any other percentile):

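library("survival")

# Simulate 100 follow-up times (in days) and event indicators (1 = event, 0 = censored)
days <- rpois(100, 365)
status <- rbinom(100, 1, 0.34)

# Fit an overall Kaplan-Meier curve and print the life table
surv <- survfit(Surv(days, status) ~ 1)
summary(surv)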

This code provides the following "life table" showing, at each event time, the number of subjects at risk, the number of events, the Kaplan-Meier (KM) survival estimate, and the 95% CI for that estimate. Because the data are simulated without a fixed seed, your numbers will differ:

time n.risk n.event survival std.err lower 95% CI upper 95% CI
358     71       1    0.838 0.03849        0.766        0.917
360     69       1    0.826 0.03980        0.751        0.908
361     68       1    0.814 0.04103        0.737        0.898
364     65       1    0.801 0.04226        0.723        0.888
367     61       1    0.788 0.04356        0.707        0.878
368     58       3    0.747 0.04724        0.660        0.846
369     52       3    0.704 0.05065        0.612        0.811
370     49       1    0.690 0.05161        0.596        0.799
371     46       1    0.675 0.05263        0.579        0.786
373     44       2    0.644 0.05452        0.546        0.760
374     42       2    0.613 0.05607        0.513        0.734
375     40       1    0.598 0.05673        0.497        0.720
379     32       1    0.579 0.05796        0.476        0.705
381     30       1    0.560 0.05915        0.455        0.689
382     29       2    0.522 0.06106        0.415        0.656
384     26       2    0.481 0.06260        0.373        0.621
387     20       1    0.457 0.06393        0.348        0.601
388     18       1    0.432 0.06523        0.321        0.581
389     15       1    0.403 0.06694        0.291        0.558
397      8       1    0.353 0.07518        0.232        0.536
401      6       1    0.294 0.08250        0.170        0.510
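
Each value in the survival column is the previous value multiplied by (1 - n.event / n.risk); for instance, 0.814 x (1 - 1/65) is about 0.801, which is the 364-day row. The following is a minimal sketch of that product-limit calculation, checked against survfit(). The toy data are invented for illustration and are not from the question or the table above.

```r
library(survival)

# Hypothetical toy data (invented for illustration)
days   <- c(45, 135, 200, 300, 400, 500)
status <- c(1,  1,   0,   1,   0,   0)   # 1 = event, 0 = censored

fit <- survfit(Surv(days, status) ~ 1)

# Manual product-limit estimate at each distinct event time
event_times <- sort(unique(days[status == 1]))
n_risk  <- sapply(event_times, function(t) sum(days >= t))
n_event <- sapply(event_times, function(t) sum(days == t & status == 1))
km_manual <- cumprod(1 - n_event / n_risk)

# summary() with no 'times' argument reports the event times only
s <- summary(fit)
cbind(time = s$time, survfit = s$surv, manual = km_manual)
```

The two columns should agree, which is a quick way to convince yourself what the table is reporting.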

The statistic of interest is the 1-year survival. If an event occurred at exactly 365 days, you could read the KM estimate for one year directly from that row. In this example, as in many real data sets, no event falls on the exact day of interest. Because the KM curve is a step function that changes only at event times, the convention is to take the estimate from the latest event time that does not exceed the time of interest. Here the 1-year survival is 0.801, taken from the 364-day row.
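
Rather than reading the value off the table by eye, you can ask summary() for the estimate at a specific time through its times argument. The sketch below assumes a data frame shaped like the one in the question; the values in this stand-in comp.dat are invented for illustration.

```r
library(survival)

# Hypothetical stand-in for the asker's comp.dat (values invented)
comp.dat <- data.frame(
  days   = c(45, 135, 200, 300, 400, 500, 1564),
  status = c(1,  1,   0,   1,   0,   1,   0)
)

fit <- survfit(Surv(days, status) ~ 1, data = comp.dat)

# KM estimate at 365 days, carried forward from the last event time <= 365
s365 <- summary(fit, times = 365)
s365$surv    # point estimate
s365$lower   # lower 95% confidence limit
s365$upper   # upper 95% confidence limit
```

Passing a vector, for example times = c(180, 365, 730), returns one row per requested time.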

The median survival time is the earliest event time at which the KM estimate falls to 50% or below. Here the estimate is 0.522 at 382 days and 0.481 at 384 days, so the median survival time is 384 days.
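
The same lookup can be done programmatically with the quantile() method for survfit objects. This is a small sketch on invented toy data, chosen so that the curve drops below 50% at day 300.

```r
library(survival)

# Hypothetical toy data (invented for illustration)
days   <- c(45, 135, 200, 300, 400, 500)
status <- c(1,  1,   0,   1,   0,   0)   # 1 = event, 0 = censored

fit <- survfit(Surv(days, status) ~ 1)

# Median survival time: first time the KM estimate falls to 0.5 or below
med <- quantile(fit, probs = 0.5, conf.int = FALSE)
med

# Other percentiles work the same way, e.g. the time by which 25% have had the event
q25 <- quantile(fit, probs = 0.25, conf.int = FALSE)
q25
```

Setting conf.int = TRUE instead returns the confidence limits along with the quantile.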

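Note that the median survival time will not be reported (it shows as NA) if the estimated survival remains above 50% at the end of follow-up. With only 14 events among 41 animals, that is the likely reason the median is NA in your output.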
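
To see this behaviour in isolation, here is a sketch on invented data with heavy censoring, where the KM curve never reaches 50%. The values are hypothetical and only loosely mimic the situation in the question.

```r
library(survival)

# Hypothetical data with few events (values invented for illustration)
days   <- c(45, 135, 200, 300, 400, 500, 1564)
status <- c(1,  1,   0,   0,   0,   0,   0)   # only two deaths

fit <- survfit(Surv(days, status) ~ 1)

# The curve bottoms out at 5/7 (about 0.71), so no median can be estimated
min_surv <- min(fit$surv)
med <- quantile(fit, probs = 0.5, conf.int = FALSE)

min_surv
med        # NA, because survival never falls to 0.5
```

An NA median here is not an error in the code; survival at a fixed time such as 365 days can still be estimated from the same fit.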

I generally find the UCLA Institute for Digital Research and Education to be an excellent resource for code examples and lucid explanations of basic and advanced survival analysis concepts: http://www.ats.ucla.edu/stat/r/examples/asa/asa_ch2_r.htm
