The only way to show differential survival with true KM curves is to generate separate curves for the different groups. You could then display a curve for all persons in group 3, for example. The number of units in each group decreases as the number of strata increases. However, this method is empirical and does not truly adjust the sample to some chosen set of covariate values.
I am most familiar with methods for obtaining adjusted curves from Cox or parametric survival models. Generally speaking, the role of an adjusted curve is to graphically display the expected mortality (or mortality transformed to survival) of the sample when one covariate, or a combination of covariates, is set to some fixed value or set of values.
For example, a Cox model might estimate the hazard ratio for blood pressure to be 1.1. Thus, for each 1-unit increase in blood pressure, the hazard at a given time point is multiplied by 1.1. If we then wanted to display the mortality curve for all units under analysis (the sample) adjusted to a blood pressure of 1 standard deviation above the mean, we could display an adjusted curve.
Here is a self-contained example using group for your reference. Note that the final, adjusted curve is for the mean of group, which, for most applications, would have no real meaning. Also note that a transformation from mortality to survival is required for this method.
library("survival")
set.seed(1)  # for reproducibility
days <- rpois(100, 3)
status <- rbinom(100, 1, 0.34)
group <- sample(c(1, 2, 3, 4), 100, replace = TRUE)
df <- data.frame(days, status, group)  # keep all variables in the data frame
#overall survival
surv <- survfit(Surv(days, status) ~ 1, df)
summary(surv)
plot(surv)
#survival by group
kmsurv <- survfit(Surv(days, status) ~ group, df)
plot(kmsurv)
#survival adjusted to group effect (curve evaluated at the mean of group)
cox <- coxph(Surv(days, status) ~ group, df)
summary(cox)
plot(survfit(cox))
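The curve from plot(survfit(cox)) is evaluated at the mean of the covariates. To adjust to a chosen covariate value instead (as in the blood-pressure example above, one standard deviation above the mean), survfit() on a coxph fit accepts a newdata argument. A minimal sketch, with group standing in for the covariate:

```r
library("survival")
set.seed(1)
days <- rpois(100, 3)
status <- rbinom(100, 1, 0.34)
group <- sample(c(1, 2, 3, 4), 100, replace = TRUE)
df <- data.frame(days, status, group)
cox <- coxph(Surv(days, status) ~ group, df)
# survival curve adjusted to group fixed at its mean + 1 SD
nd <- data.frame(group = mean(df$group) + sd(df$group))
sf <- survfit(cox, newdata = nd)
plot(sf)
```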
Essential information on the R code can be found here.
Lastly, it is my opinion that adjusted survival analysis is generally statistical sleight of hand, as the adjustment process can 1) be used for unrealistic patterns of covariates, 2) fool the reader into believing that non-significant effects produce some displayed survival/mortality pattern, and 3) be confused with empirical curves, leading readers to believe you have more events or information for each subgroup/pattern than you actually possess. I would carefully consider why adjusted curves are desirable over adjusted hazard ratios before spending too much time on them.
Best Answer
Let us define the linear predictor (i.e. log hazard ratio) $\eta = X^T \beta$. Then the proportional hazards model can be written as
$h(t | \eta) = h_o(t) \exp(\eta)$
This relation is equivalent to
$S(t | \eta) = S_o(t)^{\exp(\eta)}$
since $S(t | \eta) = \exp(-H_o(t) \exp(\eta))$, where $H_o$ is the baseline cumulative hazard.
Therefore, if we plot the cloglog (complementary log-log, or $\ln(-\ln)$ as stated in the question) of the survival functions, we get
$\text{cloglog} (S(t | \eta) ) = \text{cloglog} (S_o(t)) + \eta$
i.e. if the proportional hazards assumption is true, the curves should differ only by a constant. It's much easier to visually assess whether two curves differ by an additive constant than whether one differs by an exponential factor.
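In R, plot.survfit supports this check directly via fun = "cloglog", which plots $\ln(-\ln S(t))$ against $\log(t)$. A sketch using simulated two-group data that does satisfy proportional hazards (the rates 1 and 2 are arbitrary):

```r
library("survival")
set.seed(2)
# two groups with exponential survival times and different hazard rates
time <- c(rexp(100, rate = 1), rexp(100, rate = 2))
grp <- rep(c("A", "B"), each = 100)
status <- rep(1, 200)  # no censoring, for simplicity
fit <- survfit(Surv(time, status) ~ grp)
# under proportional hazards the two curves should be roughly parallel,
# separated by a constant vertical distance
plot(fit, fun = "cloglog", col = 1:2,
     xlab = "log(t)", ylab = "log(-log(S(t)))")
```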
As an example, here is simulated data that does not follow the proportional hazards model (it follows a proportional odds model instead). Looking at the cloglog plots (with the average of the two cloglog functions removed for easier comparison), we can see that the difference between these two functions is not exactly constant.
However, looking at the two survival curves, you would have to have a much better eye than I will ever have to determine that these two curves do not differ by an exponential factor.
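One way to simulate such data is a log-logistic model, which satisfies proportional odds rather than proportional hazards. A hedged sketch along those lines (the shape value 1.5 and the group scale factor 2 are arbitrary choices, not the values used above):

```r
library("survival")
set.seed(3)
n <- 200
u <- runif(n)
grp <- rep(c(0, 1), each = n / 2)
# log-logistic times via inverse CDF: the odds of survival differ between
# groups by a constant factor (proportional odds), so hazards are NOT
# proportional
scale <- ifelse(grp == 1, 2, 1)          # group effect on the scale parameter
time <- scale * (u / (1 - u))^(1 / 1.5)  # common shape parameter 1.5
fit <- survfit(Surv(time, rep(1, n)) ~ grp)
# the gap between the cloglog curves shrinks over time instead of staying
# constant
plot(fit, fun = "cloglog", col = 1:2,
     xlab = "log(t)", ylab = "log(-log(S(t)))")
```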