Solved – Why do coxph() and survreg(, dist=”exponential”) NOT return the same coefficient (except for expected opposite sign) in R

cox-modelrsurvivalweibull distribution

If I understand correctly, the coefficient of a covariate X under a Weibull accelerated failure time (AFT) model is related to the log(hazard ratio) of the Cox proportional hazard (PH) model in the following way:

d1 = -c1/shape.gamma

where d1 is the AFT coefficient, c1 is the Cox PH coefficient log(hazard ratio), and shape.gamma is the shape parameter of the Weibull distribution. See for example slide 36 in this handout.

So for shape.gamma=1 (the Weibull distribution is then equal to the exponential distribution) d1 and c1 should be equal (except for sign).

My question is: Why, using the survival package in R, do coxph() and survreg(, dist="exponential") NOT return the same coefficient estiamtes (in absolute value)? Below is the code I used to check this. I first simulate uncensored data using the exponential AFT parametrization and, as expected, survreg() returns 'on average' d1=0.5 (the value I used to simulate the data). So far so good, However, the coefficient returned by coxph is quite different and seems to depend very much on the scale.lambda parameter of the exponential distribution. How does this mismatch relate to the theoretical equivalence (in absolute value) of the coefficients of the exponential AFT model and the Cox PH model? What am I missing?

library(survival)
  # sample size
n            <- 1000
scale.lambda <- 2
d1           <- 0.5
ph.ests      <- c()
aft.ests     <- c()
X1           <- c(rep(0,n/2),rep(1,n/2))
for(i in 1:100){
 e        <- rexp(n,scale.lambda)
   # AFT model
 logT     <- -d1*X1 + e
 T        <- exp(logT)
   # no censoring
 event    <- rep(1,n) 
   # cox ph analysis
 m.ph     <- coxph(Surv(T,event)~X1)
   # exponential AFT analysis
 m.aft    <- survreg(Surv(T,event)~X1,dist="exponential")
   # collect X1 coefficients (on log scale)
 ph.ests  <- c(ph.ests,coef(m.ph))
 aft.ests <- c(aft.ests,coef(m.aft)[2])
}
  # summary of 100 replications
summary(aft.ests)
summary(ph.ests)

Best Answer

It has just been pointed out to me that I made a mistake in my simulation. So my theoretically based assumption was sound: under the exponential AFT model coxph() and survreg(,dist="exponential") should provide on average the same answer. However my implementation of the exponential AFT model in the R simulation was incorrect. I simulated:

logT     <- -d1*X1 + e
T        <- exp(logT)

However, under my assumptions, it is not logT that is exponentially distributed, but T. So instead, I should have replaced the above two lines in my code with:

T     <- exp(-d1*X1)*e

After this change, the code produced the results I expected.