Survival – Confirming Correct Use of Exponential Distribution with Survreg() Function

Tags: exponential distribution, r, survival

I've been fitting the Weibull and lognormal distributions with the survreg() function of the survival package. For the Weibull distribution, the survreg() output had to be transformed to the standard parameterization (the one used by R's dweibull()), as shown here: How to generate multiple forecast simulation paths for survival analysis?

I'm now moving on to the exponential distribution. [See https://stats.stackexchange.com/questions/616351/how-to-assign-reasonable-scale-parameters-to-randomly-generated-intercepts-for-t for an example of the exponential distribution.] Could someone please confirm whether the exponential distribution is being fit correctly in the R code posted at the bottom, as illustrated in the following image? If not, how do I fit the exponential correctly? I use the lung dataset only for ease of example, even though the exponential doesn't fit it well: Weibull provides the best fit.

[Image: fitted exponential survival curve (red) plotted against the Kaplan-Meier curve (blue) for the lung data]

Code:

library(survival)

time <- seq(0, 1000, by = 1)

fit <- survreg(Surv(time, status) ~ 1, data = lung, dist = "exponential")

survival <- 1 - pexp(time, rate = 1 / fit$coef)

plot(time, survival, type = "l", xlab = "Time",ylab = "Survival Probability",col = "red", lwd = 3)
lines(survfit(Surv(time, status) ~ 1, data = lung), col = "blue")
legend("topright",legend = c("Fitted exponential","Kaplan-Meier" ),col = c("red", "blue"),lwd = c(3, 1),bty = "n")

Best Answer

You've gotten trapped by location-scale modeling again. The model you fit is:

$$\log(T)\sim \beta_0 + W, $$

where $\beta_0$ is your fit$coef (location) and $W$ represents a standard minimum extreme value distribution. The scale factor multiplying $W$ for a corresponding Weibull model is set exactly to 1 for an exponential model.

Thus $\beta_0$ is a value on the log scale of time. On the linear time scale, exponentiating it gives the mean survival time, and the reciprocal of that is the rate argument to supply to pexp().
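To see why, here is a short derivation. It assumes the standard minimum extreme value distribution has survival function $P(W > w) = \exp(-e^{w})$, and it introduces $\lambda$ as a label for the exponential rate (not a symbol used by survreg()):

$$P(T > t) = P\big(W > \log t - \beta_0\big) = \exp\!\big(-e^{\log t - \beta_0}\big) = \exp\!\big(-t\,e^{-\beta_0}\big),$$

which is the survival function of an exponential distribution with rate

$$\lambda = e^{-\beta_0} = \frac{1}{\exp(\beta_0)}.$$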

1/exp(fit$coef)
# (Intercept) 
# 0.002370928 

Try that.
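For reference, a minimal sketch of the question's plot with only the rate corrected. The names rate, rate_closed_form, t_grid and surv_exp are introduced here for illustration. The cross-check relies on the closed-form maximum likelihood estimate for a right-censored exponential (number of events divided by total follow-up time) and on status == 2 marking a death in lung.

```r
library(survival)

# Intercept-only exponential fit, as in the question
fit <- survreg(Surv(time, status) ~ 1, data = lung, dist = "exponential")

# The intercept is on the log-time scale, so the rate is 1 / exp(intercept)
rate <- unname(1 / exp(coef(fit)))

# Cross-check against the closed-form MLE: events / total follow-up time
# (assumes status == 2 codes a death, as in the lung data)
rate_closed_form <- sum(lung$status == 2) / sum(lung$time)

# Separate grid name so it is not confused with the lung$time column
t_grid <- seq(0, 1000, by = 1)
surv_exp <- pexp(t_grid, rate = rate, lower.tail = FALSE)

# Kaplan-Meier curve with the fitted exponential overlaid
plot(survfit(Surv(time, status) ~ 1, data = lung), col = "blue",
     xlab = "Time", ylab = "Survival Probability")
lines(t_grid, surv_exp, col = "red", lwd = 3)
legend("topright", legend = c("Fitted exponential", "Kaplan-Meier"),
       col = c("red", "blue"), lwd = c(3, 1), bty = "n")
```

If the two rate estimates agree, the exponential is being fit and back-transformed correctly; any remaining mismatch with the Kaplan-Meier curve reflects the exponential's poor fit to these data rather than a coding error.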