If you know the parametric distribution that your data follows then using a maximum likelihood approach and the distribution makes sense. The real advantage of Cox Proportional Hazards regression is that you can still fit survival models without knowing (or assuming) the distribution. You give an example using the normal distribution, but most survival times (and other types of data that Cox PH regression is used for) do not come close to following a normal distribution. Some may follow a log-normal, or a Weibull, or other parametric distribution, and if you are willing to make that assumption then the maximum likelihood parametric approach is great. But in many real world cases we do not know what the appropriate distribution is (or even a close enough approximation). With censoring and covariates we cannot do a simple histogram and say "that looks like a ... distribution to me". So it is very useful to have a technique that works well without needing a specific distribution.
Why use the hazard instead of the distribution function? Consider the statement: "People in group A are twice as likely to die at age 80 as people in group B." That could be true because people in group B tend to live longer than those in group A. Or it could be because people in group B tend to live shorter lives and most of them are dead long before age 80, giving a very small probability of dying at 80, while enough people in group A survive to 80 that a fair number of them die at that age, giving a much higher probability of death then. So the same statement could mean that being in group A is better or worse than being in group B. What makes more sense is to ask: of those people (in each group) who lived to 80, what proportion will die before they turn 81? That is the hazard (and the hazard is a function of the distribution function/survival function/etc.). The hazard is easier to work with in the semi-parametric model and can still give you information about the distribution.
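As a small numerical sketch of this ambiguity (the normal distributions and ages below are made up purely for illustration): suppose lifetimes in group A are roughly Normal(78, 5) and in group B roughly Normal(70, 5). Then group A has the higher probability of dying in the year after the 80th birthday, yet group B has the higher hazard at 80:

```r
# hypothetical lifetime distributions (illustration only)
mA <- 78; mB <- 70; s <- 5

# P(die in [80, 81)) -- the "probability of dying at age 80"
pA <- pnorm(81, mA, s) - pnorm(80, mA, s)
pB <- pnorm(81, mB, s) - pnorm(80, mB, s)

# discrete hazard at 80: P(die in [80, 81) | alive at 80)
hA <- pA / (1 - pnorm(80, mA, s))
hB <- pB / (1 - pnorm(80, mB, s))

pA > pB  # TRUE: group A has the higher probability of death at 80...
hA < hB  # TRUE: ...but the lower hazard at 80
```

So "twice as likely to die at age 80" is ambiguous, while the hazard conditions on having survived to 80 and compares the groups on equal footing.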
It is not clear to me how you generate your event times (which, in your case, might be $<0$) and event indicators (note also that runif(1) draws a single uniform value that R then recycles across all subjects):
time = rnorm(n,10,2)
S_prob = S(time)
event = ifelse(runif(1)>S_prob,1,0)
So here is a generic method, followed by some R code.
Generating survival times to simulate Cox proportional hazards models
To generate event times from the proportional hazards model, we can use the inverse probability method (Bender et al., 2005): if $V$ is uniform on $(0, 1)$ and if $S(\cdot \,|\, \mathbf{x})$ is the conditional survival function derived from the proportional hazards model, i.e.
$$
S(t \,|\, \mathbf{x}) = \exp \left( -H_0(t) \exp(\mathbf{x}^\prime \mathbf{\beta}) \vphantom{\Big(} \right)
$$
then it is a fact that the random variable
$$
T = S^{-1}(V \,|\, \mathbf{x}) = H_0^{-1} \left( - \frac{\log(V)}{\exp(\mathbf{x}^\prime \mathbf{\beta})} \right)
$$
has survival function $S(\cdot \,|\, \mathbf{x})$. This result is known as "the inverse probability integral transformation". Therefore, to generate a survival time $T \sim S(\cdot \,|\, \mathbf{x})$ given the covariate vector, it suffices to draw $v$ from $V \sim \mathrm{U}(0, 1)$ and to make the inverse transformation $t = S^{-1}(v \,|\, \mathbf{x})$.
Example [Weibull baseline hazard]
Let $h_0(t) = \lambda \rho t^{\rho - 1}$ with shape $\rho > 0$ and scale $\lambda > 0$. Then $H_0(t) = \lambda t^\rho$ and $H^{-1}_0(t) = (\frac{t}{\lambda})^{\frac{1}{\rho}}$. Following the inverse probability method, a realisation of $T \sim S(\cdot \,|\, \mathbf{x})$ is obtained by computing
$$
t = \left( - \frac{\log(v)}{\lambda \exp(\mathbf{x}^\prime \mathbf{\beta})} \right)^{\frac{1}{\rho}}
$$
with $v$ a uniform variate on $(0, 1)$. Using results on transformations of random variables, one may notice that $T$ has a conditional Weibull distribution (given $\mathbf{x}$) with shape $\rho$ and scale $\lambda \exp(\mathbf{x}^\prime \mathbf{\beta})$.
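As a quick sanity check of that last claim, we can compare draws from the inverse probability method against direct draws from rweibull. Note that R parameterises the Weibull with a scale $b$ such that $S(t) = \exp(-(t/b)^{\text{shape}})$, so here $b = (\lambda \exp(\mathbf{x}^\prime \mathbf{\beta}))^{-1/\rho}$ (the parameter values below are arbitrary):

```r
set.seed(42)
lambda <- 0.01; rho <- 1.5; beta <- -0.6; x <- 1

# inverse probability method
v <- runif(1e5)
t1 <- (-log(v) / (lambda * exp(x * beta)))^(1 / rho)

# direct draw: Weibull with shape rho and R-style scale (lambda * e^{x'beta})^(-1/rho)
t2 <- rweibull(1e5, shape = rho, scale = (lambda * exp(x * beta))^(-1 / rho))

# the two samples should agree in distribution
ks.test(t1, t2)  # p-value should typically be large
```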
R code
The following R function generates a data set with a single binary covariate $x$ (e.g. a treatment indicator). The baseline hazard has a Weibull form. Censoring times are randomly drawn from an exponential distribution.
# baseline hazard: Weibull
# N = sample size
# lambda = scale parameter in h0()
# rho = shape parameter in h0()
# beta = fixed effect parameter
# rateC = rate parameter of the exponential distribution of C
simulWeib <- function(N, lambda, rho, beta, rateC)
{
  # covariate --> N Bernoulli trials
  x <- sample(x=c(0, 1), size=N, replace=TRUE, prob=c(0.5, 0.5))

  # Weibull latent event times
  v <- runif(n=N)
  Tlat <- (- log(v) / (lambda * exp(x * beta)))^(1 / rho)

  # censoring times
  C <- rexp(n=N, rate=rateC)

  # follow-up times and event indicators
  time <- pmin(Tlat, C)
  status <- as.numeric(Tlat <= C)

  # data set
  data.frame(id=1:N,
             time=time,
             status=status,
             x=x)
}
Test
Here is a quick simulation with $\beta = -0.6$ (coxph requires the survival package):
library(survival)

set.seed(1234)
betaHat <- rep(NA, 1e3)
for(k in 1:1e3)
{
  dat <- simulWeib(N=100, lambda=0.01, rho=1, beta=-0.6, rateC=0.001)
  fit <- coxph(Surv(time, status) ~ x, data=dat)
  betaHat[k] <- fit$coef
}
> mean(betaHat)
[1] -0.6085473
Best Answer
The estimation of life expectancy in the presence of censored data necessarily requires assumptions about the unobserved part of the survival function. A parametric distribution can be used to extrapolate from observed to expected survival, but it is not easy to capture the shape of the unobserved survival function. A possible approach is to use relative survival (see for instance Andersson et al. 2012).
If you wish to avoid extrapolating beyond the data, you can instead evaluate survival percentiles. With a sufficient number of events you can estimate the median survival time; when fewer than 50% of subjects experience the event during follow-up, lower percentiles (e.g. the 25th) can still be estimated.
Survival percentiles can be calculated from the Kaplan-Meier estimator, which summarizes the observed survival. If you are interested in adjusted survival percentiles, you may take a look at Laplace regression (see for instance Orsini et al. 2012).
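As a minimal sketch of Kaplan-Meier percentiles in R (the data here are simulated, just to show the calls; quantile for survfit objects is provided by the survival package):

```r
library(survival)

# simulated example data: exponential event times with random censoring
set.seed(1)
n <- 200
T <- rexp(n, rate = 0.1)      # latent event times
C <- rexp(n, rate = 0.05)     # censoring times
time <- pmin(T, C)
status <- as.numeric(T <= C)

# Kaplan-Meier estimate of the survival curve
fit <- survfit(Surv(time, status) ~ 1)

# survival percentiles: 25th percentile (25% have died) and median
quantile(fit, probs = c(0.25, 0.5))
```

If the curve never drops below 50%, the median is not estimable (it comes back NA), but the lower percentiles may still be.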