Survival Analysis – Relationship Between Survival Distributions and Log Linear Regression in Accelerated Failure Time Models

accelerated-failure-time, failure-rate, survival

The accelerated failure time model (AFT) can be expressed as:

$S_{1}(t) = S_{0}\left(\frac{t}{\gamma}\right)$, where $S_{0}$ is a specified baseline survival distribution and $\frac{1}{\gamma}$ is the accelerant/decelerant (the factor by which time is rescaled).
The probability of survival in group 1 ($S_{1}$) at time $t$ is the same as the probability of survival in the baseline group ($S_{0}$) at time $\frac{t}{\gamma}$, so the baseline group is considered to age $\gamma$ times as fast as group 1. $\gamma$ is modeled as the exponential of a linear combination of features, $\gamma(x)=\exp\{\theta\cdot X\}$, which gives $S_{1}(t) = S_{0}\left(\frac{t}{\exp\{\theta\cdot X\}}\right)$.
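
To make the time rescaling concrete, here is a minimal Python sketch. The Weibull baseline and the values of `shape`, `scale`, `theta`, and `x` are assumptions chosen purely for illustration; none of them come from the references.

```python
import math

def weibull_baseline_survival(t, shape=1.5, scale=10.0):
    """Baseline S_0(t) = exp(-(t/scale)^shape); the Weibull form is an assumed example."""
    return math.exp(-((t / scale) ** shape))

def aft_survival(t, theta, x):
    """S_1(t) = S_0(t / gamma) with gamma = exp(theta . x)."""
    gamma = math.exp(sum(th * xi for th, xi in zip(theta, x)))
    return weibull_baseline_survival(t / gamma)

theta = [0.5, -0.2]   # hypothetical coefficients
x = [1.0, 1.0]        # hypothetical covariate values
gamma = math.exp(sum(th * xi for th, xi in zip(theta, x)))

t = 8.0
# Survival in group 1 at time t equals baseline survival at time t / gamma.
print(aft_survival(t, theta, x), weibull_baseline_survival(t / gamma))
# Equivalently, the baseline at time t matches group 1 at the later time gamma * t.
print(weibull_baseline_survival(t), aft_survival(gamma * t, theta, x))
```

With these values $\gamma > 1$, so group 1 needs $\gamma t$ units of time to reach the survival probability that the baseline group reaches at time $t$.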

references: 7.3.1 Accelerated Life Models; Lecture eighteen: The Accelerated Failure Time (AFT) Model; lifelines: Accelerated failure time models

The accelerated failure time model (AFT) can also be expressed as a linear function of log time (log linear regression):

$\log(T) = (\theta\cdot X) + \sigma\epsilon$, where $\epsilon$ is an error term with a specified distribution and $\sigma$ is a scale factor.
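
As a rough illustration of the log-linear form, the sketch below simulates survival times from it. The standard normal error (which makes $T$ log-normal) and the values of `theta`, `x`, and `sigma` are assumptions made for the example only.

```python
import math
import random
import statistics

random.seed(0)

theta = [0.5, -0.2]   # hypothetical coefficients
x = [1.0, 1.0]        # hypothetical covariate values
sigma = 0.5           # hypothetical scale factor
linear_predictor = sum(th * xi for th, xi in zip(theta, x))

# Draw epsilon from a standard normal (an assumed error distribution),
# form log(T) = theta . x + sigma * epsilon, then exponentiate.
log_times = [linear_predictor + sigma * random.gauss(0.0, 1.0) for _ in range(20000)]
times = [math.exp(v) for v in log_times]

# For an error distribution with median 0, the median of T is exp(theta . x),
# so the two printed numbers should be close.
print(statistics.median(times), math.exp(linear_predictor))
```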

references: Accelerated Failure Time Models, pg. 15; Survival Analysis with Accelerated Failure Time

What is the mathematical relationship between the two expressions of the accelerated failure time model above? Starting from $\log(T) = (\theta\cdot X) + \sigma\epsilon$, how do you get to $S_{1}(t) = S_{0}(\frac{t}{\gamma})$, or vice versa?

Best Answer

This adapts the very useful and concise course notes on parametric survival by Germán Rodríguez, Section 2.2, to your formulation of the accelerated failure time model.

In your form, $\log(T) = (\theta\cdot X) + \sigma\epsilon$, the baseline survival curve, that is, the probability that an individual survives beyond time $t$ at the reference levels of the covariates (with $\theta\cdot X=0$), is:

$$ S_0(t) = \Pr\{T_0 > t\}= \Pr\{\epsilon>\log(t)/\sigma \},$$

where the baseline survival time is defined by the random term alone, $T_0= \exp\{\sigma \epsilon \}$. For a set of non-baseline covariate values $x$, exponentiating the regression equation gives $T = \exp\{\theta\cdot X + \sigma\epsilon\}$, so the corresponding random variable $T$ is distributed as $T_0 e^{\theta\cdot X}$. The corresponding survival curve conditional on those covariate values is:

$$S(t, x) = \Pr\{T > t \mid x \} = \Pr\{T_0 e^{\theta\cdot X} > t \} = \Pr\{T_0 > t e^{-\theta\cdot X} \} = S_0(t e^{-\theta\cdot X}),$$

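illustrating the effective time compression or expansion by a factor of $e^{-\theta\cdot X}$ as a function of covariate values. With $\gamma = \exp\{\theta\cdot X\}$, this is exactly your first form, $S_{1}(t) = S_{0}(\frac{t}{\gamma})$.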
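
The identity can also be checked numerically. The sketch below assumes a standard normal $\epsilon$ (a log-normal AFT model) and hypothetical values for `sigma` and for $\theta\cdot X$ (`linear_predictor`). It computes $\Pr\{T > t \mid x\}$ directly from the regression form and compares it with the time-rescaled baseline.

```python
import math

def error_survival(z):
    """Pr{epsilon > z} for a standard normal error (an assumed distribution)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def baseline_survival(t, sigma):
    """S_0(t) = Pr{epsilon > log(t) / sigma}."""
    return error_survival(math.log(t) / sigma)

def survival_from_regression(t, linear_predictor, sigma):
    """Pr{T > t | x} computed directly from log(T) = theta . x + sigma * epsilon."""
    return error_survival((math.log(t) - linear_predictor) / sigma)

sigma = 0.5              # hypothetical scale factor
linear_predictor = 0.3   # hypothetical value of theta . x

for t in (0.5, 1.0, 2.0, 5.0):
    direct = survival_from_regression(t, linear_predictor, sigma)
    rescaled = baseline_survival(t * math.exp(-linear_predictor), sigma)
    # The two columns agree up to floating-point rounding.
    print(t, direct, rescaled)
```

Any other error distribution could be swapped into `error_survival`; the agreement between the two columns depends only on the algebra above, not on the normal assumption.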