The only way to show differential survival with true KM curves is to generate separate curves for the different groups. You could then display a curve for all persons in group 3, for example. The number of units in each group decreases as the number of strata increases. However, this method is empirical and does not truly adjust the sample to some chosen set of covariate values.
I am most familiar with methods for obtaining adjusted curves from Cox or parametric survival models. Generally speaking, the role of an adjusted curve is to graphically display the expected mortality (or mortality transformed to survival) of the sample when one covariate, or a combination of covariates, is set to some fixed value or set of values.
For example, a Cox model might estimate the hazard ratio for blood pressure to be 1.1. Thus, for each 1-unit increase in blood pressure, the hazard at a given time point is multiplied by 1.1. If we then wanted to display the mortality curve for all units under analysis (the sample) adjusted to a blood pressure of 1 standard deviation above the mean, we could display an adjusted curve.
Here is a self-contained example using group for your reference. Note that the final, adjusted curve is for the mean of group, which, for most applications, would have no real meaning. Also note that a transformation from mortality to survival is required for this method.
library("survival")
set.seed(1)  # for reproducibility
days <- rpois(100, 3)
status <- rbinom(100, 1, 0.34)
group <- sample(c(1, 2, 3, 4), 100, replace = TRUE)
df <- data.frame(days, status, group)  # keep all variables in the data frame
#overall survival
surv <- survfit(Surv(days, status) ~ 1, df)
summary(surv)
plot(surv)
#survival by group
kmsurv <- survfit(Surv(days, status) ~ group, df)
plot(kmsurv)
#survival adjusted to group effect (curve evaluated at the mean of group)
cox <- coxph(Surv(days, status) ~ group, df)
summary(cox)
plot(survfit(cox))
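The curve from plot(survfit(cox)) is evaluated at the mean of the covariates. To adjust to a chosen covariate value instead (as in the blood-pressure example above, one standard deviation above the mean), survfit() on a coxph fit accepts a newdata argument. A minimal sketch, with group standing in for the covariate:

```r
library("survival")
set.seed(1)
days <- rpois(100, 3)
status <- rbinom(100, 1, 0.34)
group <- sample(c(1, 2, 3, 4), 100, replace = TRUE)
df <- data.frame(days, status, group)
cox <- coxph(Surv(days, status) ~ group, df)
# survival curve adjusted to group fixed at its mean + 1 SD
nd <- data.frame(group = mean(df$group) + sd(df$group))
sf <- survfit(cox, newdata = nd)
plot(sf)
```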
Essential information on the R code can be found here.
Lastly, it is my opinion that adjusted survival analysis is generally statistical sleight of hand, as the adjustment process can 1) be used for unrealistic patterns of covariates, 2) fool the reader into believing that non-significant effects produce some displayed survival/mortality pattern, and 3) be confused with empirical curves, leading readers to believe you have more events or information for each subgroup/pattern than you actually possess. I would carefully consider why adjusted curves are desirable over adjusted hazard ratios before spending too much time on them.
Best Answer
Let us define the linear predictor (i.e. log hazard ratio) $\eta = X^T \beta$. Then the proportional hazards model can be written as
$h(t | \eta) = h_o(t) \exp(\eta)$
This relation is equivalent to
$S(t | \eta) = S_o(t)^{\exp(\eta)}$
since $S(t | \eta) = \exp(-H_o(t) \exp(\eta))$, where $H_o$ is the baseline cumulative hazard.
Therefore, if we plot the cloglog (complementary log-log, or $\ln(-\ln)$ as stated in the question) of the survival functions, we get
$\text{cloglog} (S(t | \eta) ) = \text{cloglog} (S_o(t)) + \eta$
i.e. if the proportional hazards assumption is true, the curves should differ only by a constant. It's much easier to visually assess whether two curves differ by an additive constant than whether one differs by an exponential factor.
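In R, plot.survfit supports this check directly via fun = "cloglog", which plots $\ln(-\ln S(t))$ against $\log(t)$. A sketch using simulated two-group data that does satisfy proportional hazards (the rates 1 and 2 are arbitrary):

```r
library("survival")
set.seed(2)
# two groups with exponential survival times and different hazard rates
time <- c(rexp(100, rate = 1), rexp(100, rate = 2))
grp <- rep(c("A", "B"), each = 100)
status <- rep(1, 200)  # no censoring, for simplicity
fit <- survfit(Surv(time, status) ~ grp)
# under proportional hazards the two curves should be roughly parallel,
# separated by a constant vertical distance
plot(fit, fun = "cloglog", col = 1:2,
     xlab = "log(t)", ylab = "log(-log(S(t)))")
```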
As an example, here is simulated data that does not follow the proportional hazards model (it follows a proportional odds model instead). Looking at the cloglog plots (with the average of the two cloglog functions removed for easier comparison), we can see that the difference between these two functions is not exactly constant.
However, looking at the two survival curves, you would have to have a much better eye than I will ever have to determine that these two curves do not differ by an exponential factor.
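One way to simulate such data is a log-logistic model, which satisfies proportional odds rather than proportional hazards. A hedged sketch along those lines (the shape value 1.5 and the group scale factor 2 are arbitrary choices, not the values used above):

```r
library("survival")
set.seed(3)
n <- 200
u <- runif(n)
grp <- rep(c(0, 1), each = n / 2)
# log-logistic times via inverse CDF: the odds of survival differ between
# groups by a constant factor (proportional odds), so hazards are NOT
# proportional
scale <- ifelse(grp == 1, 2, 1)          # group effect on the scale parameter
time <- scale * (u / (1 - u))^(1 / 1.5)  # common shape parameter 1.5
fit <- survfit(Surv(time, rep(1, n)) ~ grp)
# the gap between the cloglog curves shrinks over time instead of staying
# constant
plot(fit, fun = "cloglog", col = 1:2,
     xlab = "log(t)", ylab = "log(-log(S(t)))")
```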