Solved – the parameterization of exponential distribution for survival in Stata

exponential distributionparameterizationrstatasurvival

I'm new to data analysis so this is kind of a simple question.

I would like to understand why I cannot reproduce a survival curve generated by a fitted exponential model from Stata. I use the coefficients and make my function in R to plot but it looks nothing similar. I believe it's one of those daft problems where I am not interpreting something properly. I illustrate below.

First, some data in Stata:

use http://www.ats.ucla.edu/stat/data/uis.dta, clear
gen id = ID
drop ID
stset time, failure(censor)

Then we can fit a null exponential model

streg, dist(exponential) nohr

Which gives the following output:

Exponential regression -- log relative-hazard form 

No. of subjects =          628                     Number of obs   =       628
No. of failures =          508
Time at risk    =       147394
                                                   LR chi2(0)      =     -0.00
Log likelihood  =    -1043.531                     Prob > chi2     =         .

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |  -5.670383   .0443678  -127.80   0.000    -5.757342   -5.583424
------------------------------------------------------------------------------

I take the survivor function from Stata's documentation:

$$
S(t)=\exp(-\lambda_{j} t_{j})
$$

So, in R, I plot this out with the following:

S <- function(x) exp(-5.670383*x) # 'x' acts like 't'
curve(S, 0, 1000)

This curve is not equivalent to Stata's, given by:

stcurve, surv

Where did I go wrong in my interpretation? Is my equation using the correct parameterization?

P.S. Why am I reproducing these curves? A little to do with overlaying curves but now that I have this problem, I have to know where I went wrong.

Best Answer

From the documentation you can see that $\lambda = \exp(xb)$, so

S <- function(x) exp(-5.670383*x)

should have been:

S <- function(x) exp(-exp(-5.670383)*x)

As an aside, I would stick to Stata for such graphs, mainly because switching programs within a project makes it harder to document that project. Here is how I would create such a graph in Stata:

twoway function survival = exp(-exp(_b[_cons])*x), ///
     range(0 <some reasonalble maximum>)           ///
     xtitle("time")

Moreover, you can replace

gen id = ID
drop ID

with

rename ID id
Related Question