Solved – Meaning of flexsurv’s flexsurvreg res.t outputs

rregression coefficientssurvivalweibull distribution

I am trying to understand the meaning of the coefficients estimates of the output of flexsurv's flexsurvreg function. For example, let us assume I want to perform the survival analysis and fit of a Weibull model with respect to a covariate. I will call flexsurvreg in order to obtain the Weibull parameters and covariate coefficient of the best fit flexsurvreg can obtain through its standard method.

fit <- flexsurvreg(Surv(time, censored)~covariate, data=struct, dist="weibull")

fit$res.t then returns the estimates of coefficients in such a fashion for example:

              est
shape         3
scale         4
covariate    -0.3

From there on, I want to try to reconstruct the analytic expression of the hazard function. From my understanding, it should be built in this fashion (for a Weibull model):

$h(t) = \frac{\text{shape}}{\text{scale}} \left(\frac{t}{\text{scale}}\right)^{\text{shape}-1}\cdot \exp\left(\text{coefficient} \cdot (\text{covariate}-\mu)\right)$

with $\text{coefficient}$ being the covariate coefficient returned in the flexsurvreg output; $\text{covariate}$ being the covariate value for which I intend to construct the function and $\mu$ the mean value of the covariate over the fitting sample.

However, the plot of this function does not match the one that plot(fit, type="hazard", newdata=list(covariate=cov)) returns. Why is that? I would guess my analytic interpretation of the coefficient estimates is wrong. What is then the mathematical meaning of these estimates?


Update: I have been through the package sources. It is still very unclear to me how the plot is done because I am not at all familiar with R's syntax nor the functions that are called.

However, I found out that I was wrong using fit$res.t, whereas fit$res actually is supposed to give out the parameters along the natural scale, as opposed to a logarithmic scale.

I also noticed that the computation of the analytical curve is likely done in the summary.flexsurvreg function.

Finally, the fit object contains functions that may be related to the computation of the hazard function in fit$dfns. How they take the covariate value into account still is fuzzy to me.

Best Answer

In actuality, the way flexsurv coefficients estimates work is slightly different from the one implied by the traditional PH model. The covariate result in fit$res (or fit$res.t) is not to be considered as the weight of the covariate as traditionally represented by Cox PH model (i.e. the $\text{coefficient}$ in the formula of the question), but much rather as a modifier for the distribution parameters. The way the parameters are modified depends on flexsurv's method add.covs (in the case of a Weibull distribution, it seems only the scale parameter is affected, but that depends on the distribution). This is mathematically equivalent, but dramatically changes the value of the output parameter, as its definition is very different.

In the case of a Weibull distribution, the add.covs method transforms the scale parameter as follows (on the natural scale):

$\text{scale}^\star = \text{scale} \cdot \exp \left(\text{coefficient} \cdot (\text{covariate})\right)$

with $\text{scale}^\star$ being the transformed parameter, $\text{coefficient}$ and $\text{covariate}$ both matching the definition of the question.

Then, the model hazard function is constructed analytically as usual:

$h(t) = \frac{\text{shape}}{\text{scale}^\star}\left(\frac{t}{\text{scale}^\star}\right)^{\text{shape}-1}$

This allows to correctly reproduce the curve as plotted by plot(fit, type="hazard", newdata=list(covariate=cov)).