Solved – Meaning of flexsurv’s flexsurvreg res.t outputs

rregression coefficientssurvivalweibull distribution

I am trying to understand the meaning of the coefficients estimates of the output of flexsurv's flexsurvreg function. For example, let us assume I want to perform the survival analysis and fit of a Weibull model with respect to a covariate. I will call flexsurvreg in order to obtain the Weibull parameters and covariate coefficient of the best fit flexsurvreg can obtain through its standard method.

fit <- flexsurvreg(Surv(time, censored)~covariate, data=struct, dist="weibull")

fit$res.t then returns the estimates of coefficients in such a fashion for example:

              est
shape         3
scale         4
covariate    -0.3

From there on, I want to try to reconstruct the analytic expression of the hazard function. From my understanding, it should be built in this fashion (for a Weibull model):

$h(t) = \frac{\text{shape}}{\text{scale}} \left(\frac{t}{\text{scale}}\right)^{\text{shape}-1}\cdot \exp\left(\text{coefficient} \cdot (\text{covariate}-\mu)\right)$

with $\text{coefficient}$ being the covariate coefficient returned in the flexsurvreg output; $\text{covariate}$ being the covariate value for which I intend to construct the function and $\mu$ the mean value of the covariate over the fitting sample.

However, the plot of this function does not match the one that plot(fit, type="hazard", newdata=list(covariate=cov)) returns. Why is that? I would guess my analytic interpretation of the coefficient estimates is wrong. What is then the mathematical meaning of these estimates?

Update: I have been through the package sources. It is still very unclear to me how the plot is done because I am not at all familiar with R's syntax nor the functions that are called.

However, I found out that I was wrong using fit$res.t, whereas fit$res actually is supposed to give out the parameters along the natural scale, as opposed to a logarithmic scale.

I also noticed that the computation of the analytical curve is likely done in the summary.flexsurvreg function.

Finally, the fit object contains functions that may be related to the computation of the hazard function in fit$dfns. How they take the covariate value into account still is fuzzy to me.

Best Answer

In actuality, the way flexsurv coefficients estimates work is slightly different from the one implied by the traditional PH model. The covariate result in fit$res (or fit$res.t) is not to be considered as the weight of the covariate as traditionally represented by Cox PH model (i.e. the $\text{coefficient}$ in the formula of the question), but much rather as a modifier for the distribution parameters. The way the parameters are modified depends on flexsurv's method add.covs (in the case of a Weibull distribution, it seems only the scale parameter is affected, but that depends on the distribution). This is mathematically equivalent, but dramatically changes the value of the output parameter, as its definition is very different.

In the case of a Weibull distribution, the add.covs method transforms the scale parameter as follows (on the natural scale):

$\text{scale}^\star = \text{scale} \cdot \exp \left(\text{coefficient} \cdot (\text{covariate})\right)$

with $\text{scale}^\star$ being the transformed parameter, $\text{coefficient}$ and $\text{covariate}$ both matching the definition of the question.

Then, the model hazard function is constructed analytically as usual:

$h(t) = \frac{\text{shape}}{\text{scale}^\star}\left(\frac{t}{\text{scale}^\star}\right)^{\text{shape}-1}$

This allows to correctly reproduce the curve as plotted by plot(fit, type="hazard", newdata=list(covariate=cov)).

Related Solutions

Solved – Why do coxph() and survreg(, dist=”exponential”) NOT return the same coefficient (except for expected opposite sign) in R

It has just been pointed out to me that I made a mistake in my simulation. So my theoretically based assumption was sound: under the exponential AFT model coxph() and survreg(,dist="exponential") should provide on average the same answer. However my implementation of the exponential AFT model in the R simulation was incorrect. I simulated:

logT     <- -d1*X1 + e
T        <- exp(logT)

However, under my assumptions, it is not logT that is exponentially distributed, but T. So instead, I should have replaced the above two lines in my code with:

T     <- exp(-d1*X1)*e

After this change, the code produced the results I expected.

Solved – Meaning of Weibull scale and shape from flexsurvreg

Weibull distribution has probability density function

$$ f(x;\lambda,k) = \begin{cases} \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}e^{-(x/\lambda)^{k}} & x\geq0 ,\\ 0 & x<0 \end{cases} $$

where $\lambda>0$ is scale parameter and $k>0$ is shape parameter. Different values of parameters are presented on the plot below.

Basically, as the names suggests, shape parameter controls it's shape and scale parameter makes it wider or narrower (notice the $x/\lambda$ parts of probability density function). To gain more intuition you can plot yourself the distribution with different parameter values and check what happens when you change them.

The description of shape parameter is nicely summarized on Wikipedia:

If the quantity $X$ is a "time-to-failure", the Weibull distribution gives a distribution for which the failure rate is proportional to a power of time. The shape parameter, $k$, is that power plus one, and so this parameter can be interpreted directly as follows:

A value of $k < 1$ indicates that the failure rate decreases over time. This happens if there is significant "infant mortality", or defective items failing early and the failure rate decreasing over time as the defective items are weeded out of the population. In the context of the diffusion of innovations, this means negative word of mouth: the hazard function is a monotonically decreasing function of the proportion of adopters;

A value of $k = 1$ indicates that the failure rate is constant over time. This might suggest random external events are causing mortality, or failure. The Weibull distribution reduces to an exponential distribution;

A value of $k > 1$ indicates that the failure rate increases with time. This happens if there is an "aging" process, or parts that are more likely to fail as time goes on. In the context of the diffusion of innovations, this means positive word of mouth: the hazard function is a monotonically increasing function of the proportion of adopters. The function is first concave, then convex with an inflexion point at $(e^{1/k} - 1)/e^{1/k}, k > 1$.

Best Answer

Related Solutions

Solved – Why do coxph() and survreg(, dist=”exponential”) NOT return the same coefficient (except for expected opposite sign) in R

Solved – Meaning of Weibull scale and shape from flexsurvreg

Related Question