Survival Analysis – Interpreting Output of Weibull Accelerated Failure Time Model


In this case study I have to assume a baseline Weibull distribution. I'm fitting an accelerated failure time (AFT) model, which I will later interpret in terms of both hazard ratios and survival times.

The data looks like this.

head(data1.1)

    TimeSurv IndSurv Treat Age
1     6 days       1     D  27
2    33 days       1     D  43
3   361 days       1     I  36
4   488 days       1     I  54
5   350 days       1     D  49
6   721 days       1     I  49
7  1848 days       0     D  32
8   205 days       1     D  47
9   831 days       1     I  24
10  260 days       1     I  38

I'm fitting the model with the WeibullReg() function in R. The response is built with Surv(), reading TimeSurv as the observed time and IndSurv as the censoring indicator. The covariates are Treat and Age.

My issue deals with understanding the output properly:

library(survival)        # provides Surv() and survreg()
library(SurvRegCensCov)  # provides WeibullReg()
wei1 = WeibullReg(Surv(TimeSurv, IndSurv) ~ Treat + Age, data=data1.1)
wei1


$formula
Surv(TimeSurv, IndSurv) ~ Treat + Age

$coef
            Estimate           SE
lambda  0.0009219183 0.0006803664
gamma   0.9843411517 0.0931305471
TreatI -0.5042111027 0.2303038312
Age     0.0180225253 0.0089632209

$HR
              HR       LB       UB
TreatI 0.6039819 0.384582 0.948547
Age    1.0181859 1.000455 1.036231

$ETR
             ETR        LB        UB
TreatI 1.6690124 1.0574337 2.6343045
Age    0.9818574 0.9644488 0.9995801

$summary

Call:
survival::survreg(formula = formula, data = data, dist = "weibull")
               Value Std. Error     z      p
(Intercept)  7.10024    0.41283 17.20 <2e-16
TreatI       0.51223    0.23285  2.20  0.028
Age         -0.01831    0.00913 -2.01  0.045
Log(scale)   0.01578    0.09461  0.17  0.868

Scale= 1.02 

Weibull distribution
Loglik(model)= -599.1   Loglik(intercept only)= -604.1
    Chisq= 9.92 on 2 degrees of freedom, p= 0.007 
Number of Newton-Raphson Iterations: 5 
n= 120

I don't really get how Scale = 1.02 and Log(scale) = 0.01578 relate to each other. Also, the p-value for Log(scale) is large and non-significant. The documentation of the function shows that it converts some of the estimates using the scale value, so am I to assume that the values of the alphas are also not to be trusted?

Best Answer

Many (including me) get confused by the different ways to define the parameters of a Weibull distribution, particularly since the standard R Weibull-related functions in the stats package and the survreg() parametric fitting function in the survival package use different parameterizations.

The manual page for the R Weibull-related functions in stats says:

The Weibull distribution with shape parameter $a$ and scale parameter $b$ has density given by $$\frac{a}{b}\left(\frac{x}{b}\right)^{a-1}e^{-(x/b)^{a}}$$ for $x > 0$.

That's called the "standard parameterization" on the Wikipedia page (where they use $k$ for shape and $\lambda$ for scale).

The survreg() function uses a different parameterization, with differences explained on its manual page:

There are multiple ways to parameterize a Weibull distribution. The survreg function embeds it in a general location-scale family, which is a different parameterization than the rweibull function, and often leads to confusion.

survreg's scale = 1/(rweibull shape)

survreg's intercept = log(rweibull scale).
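If you want to convince yourself of that correspondence, here is a minimal simulation sketch. It is not part of the original question: the sample size, seed, and true parameter values are arbitrary choices of mine, and the data are uncensored for simplicity. It draws from rweibull() with a known shape and scale and then fits an intercept-only survreg() model.

```r
library(survival)

set.seed(1)                      # arbitrary seed, for reproducibility only
true_shape <- 2                  # rweibull "shape" (illustrative value)
true_scale <- 10                 # rweibull "scale" (illustrative value)
t_sim <- rweibull(5000, shape = true_shape, scale = true_scale)

# Intercept-only Weibull fit; Surv(t_sim) treats every time as an observed event
fit <- survreg(Surv(t_sim) ~ 1, dist = "weibull")

# survreg's scale should be close to 1 / (rweibull shape)
c(survreg_scale = fit$scale, one_over_shape = 1 / true_shape)

# survreg's intercept should be close to log(rweibull scale)
c(survreg_intercept = unname(coef(fit)), log_scale = log(true_scale))
```

With a sample this large, both pairs of numbers should agree to about two decimal places.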

The WeibullReg() function effectively takes the result from survreg() and expresses the results in terms of the "standard parameterization."
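As a rough check on that statement, the sketch below redoes the conversion by hand from the survreg numbers in your $summary table. The conversion formulas are my assumption about what the function does (gamma = 1/scale, lambda = exp(-intercept/scale), beta = -alpha/scale, HR = exp(beta), ETR = exp(alpha)). I rely on them only because they reproduce the $coef, $HR and $ETR values you posted, up to rounding. The variable names are mine.

```r
# Values copied from the $summary (survreg) table in the question
intercept <- 7.10024
alpha     <- c(TreatI = 0.51223, Age = -0.01831)  # AFT coefficients
sigma_hat <- exp(0.01578)                         # survreg scale, from Log(scale)

# Assumed conversion to the "standard" / proportional-hazards form
gamma_hat  <- 1 / sigma_hat              # shape
lambda_hat <- exp(-intercept / sigma_hat)
beta       <- -alpha / sigma_hat         # log hazard ratios
HR         <- exp(beta)                  # hazard ratios
ETR        <- exp(alpha)                 # event time ratios

round(c(lambda = lambda_hat, gamma = gamma_hat), 5)
round(cbind(beta, HR, ETR), 4)
```

So the estimated scale does enter the conversion of the coefficients. What enters is the estimate itself, though, not the p-value from the test of Log(scale) against 0.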

There is a potential confusion, however, as the $summary of the object produced by WeibullReg is "the summary table from the original survreg model." (Emphasis added.) So what you have displayed in the question includes results for both parameterizations.

That dual representation of the results helps explain what's going on.

Starting from the bottom, the survreg value of scale is the reciprocal of the "standard parameterization" value of shape. The "standard" shape parameter is called gamma in the WeibullReg $coef output near the top of your output. The value for gamma is 0.98434, with a reciprocal of 1.0159, rounding to the value of 1.02 shown as Scale below the $summary coefficient table. The natural logarithm of 1.0159 is 0.01578, shown as Log(scale) in the last row of that table. Those $summary lines, remember, are based on the survreg definition of scale.
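The arithmetic is easy to reproduce in base R. This is only a sketch using the gamma estimate copied from your output, and the variable names are mine.

```r
gamma_hat <- 0.9843411517      # "standard" shape, from the $coef table

scale_hat <- 1 / gamma_hat     # survreg's definition of scale
log_scale <- log(scale_hat)    # what survreg reports as Log(scale)

round(scale_hat, 4)
round(scale_hat, 2)            # the rounded value printed as "Scale="
round(log_scale, 5)
```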

The p-value for that Log(scale) is indeed very high. But that just means that the value of Log(scale) is not significantly different from 0, or equivalently that the scale itself (as defined in survreg) is not significantly different from 1. That test says nothing about the reliability of the hazard ratios and other estimates for the covariates. It just means that the baseline survival curve of your Weibull model can't be statistically distinguished from a simple exponential survival curve, which would have a value of exactly 1 for both the survreg scale and the "standard" shape, and a constant baseline hazard over time. So there is nothing to distrust about your results on that basis.
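To see where that p-value comes from, here is a small sketch of the Wald test on Log(scale), using the estimate and standard error from your $summary table. I am assuming the table reports the usual two-sided normal test of the estimate against 0. The 95% interval is my addition for illustration.

```r
log_scale <- 0.01578     # Log(scale) estimate from the $summary table
se        <- 0.09461     # its standard error

z <- log_scale / se              # Wald statistic for H0: Log(scale) = 0
p <- 2 * pnorm(-abs(z))          # two-sided p-value
round(c(z = z, p = p), 3)

# Approximate 95% confidence interval for scale itself (back-transformed)
ci_scale <- exp(log_scale + c(-1, 1) * qnorm(0.975) * se)
round(ci_scale, 2)
```

The interval for scale comfortably includes 1, which is another way of saying that the exponential special case is compatible with the data.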