Cox Model in Python – Calculating Time-Varying Covariate Coefficients in Cox Hazard Model

cox-model, python, survival, weibull distribution

The Cox model with time-varying covariates $x(t)$ is written as:

$$h(t|x(t)) = h_0(t)\exp((x(t) - \bar x)'\beta)$$

The above formulation can be seen here: https://lifelines.readthedocs.io/en/latest/fitters/regression/CoxTimeVaryingFitter.html

Can anyone please explain the following two things:

What is '$\bar{x}$' in the above formula?
How are the coefficients β being calculated?

Edit-1:

In method 1, we find the coefficient values that bring the score function (the first derivatives of the log partial likelihood with respect to the coefficients) to 0, as depicted in this book and shown in the image below:

In method 2, the partial likelihood is maximized with the Newton-Raphson algorithm, which uses the inverse of the Hessian matrix to update the estimate of β, as mentioned here and shown in the image below:

In method 3, the partial likelihood is maximized with the Nelder-Mead algorithm to calculate β, as mentioned here and shown in the attached file below:

Can somebody please tell me which optimization algorithm the lifelines library uses to calculate β in the Cox model?

Best Answer

If the proportional hazard assumption holds, then in principle the choices of reference or 0 values for predictors $x$ don't matter. You could re-write the formula you provided for the hazard as:

$$h(t|x(t)) = h_0(t)\exp(-\bar x' \beta) \exp(x(t)'\beta)= h_{0\bar x}(t) \exp(x(t)'\beta),$$

a constant multiplicative scaling of the original baseline hazard that will then work with the un-centered predictor values. Any re-centering of predictor variables will just mean a corresponding shift in the corresponding baseline hazard function, which isn't even directly evaluated by the Cox model.
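To check this numerically, here is a minimal sketch (not lifelines code) that evaluates the Cox log partial likelihood for a single time-fixed covariate, with and without centering. The function name `cox_partial_loglik`, the toy data, and the no-tied-event-times simplification are my own assumptions for illustration.

```python
import math

def cox_partial_loglik(times, events, xs, beta):
    """Log partial likelihood for one covariate, assuming no tied event times."""
    total = 0.0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored subjects contribute only through risk sets
        # Risk set: everyone still under observation at this event time.
        risk_weights = [
            math.exp(beta * xs[j]) for j in range(len(times)) if times[j] >= times[i]
        ]
        total += beta * xs[i] - math.log(sum(risk_weights))
    return total

# Hypothetical toy data: follow-up times, event flags (1 = event), covariate.
times = [2.0, 3.0, 5.0, 7.0, 11.0]
events = [1, 0, 1, 1, 0]
xs = [0.5, 1.2, -0.3, 2.0, 0.8]

x_bar = sum(xs) / len(xs)
centered = [x - x_bar for x in xs]

raw = cox_partial_loglik(times, events, xs, beta=0.7)
ctr = cox_partial_loglik(times, events, centered, beta=0.7)
print(raw, ctr)  # the two values agree up to floating-point rounding
```

The factor $\exp(-\bar x'\beta)$ appears in both the numerator and every term of the risk-set sum, so it cancels in each ratio. That is why the fitted $\beta$ is unaffected by centering.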

In practice, the exponential can lead to numerical instability. The help page for the R coxph() function says:

The routine internally scales and centers data to avoid overflow in the argument to the exponential function. These actions do not change the result, but lead to more numerical stability.

I suspect that the lifelines implementation centers to avoid that practical problem, with the equation written to show that centering explicitly. I don't know whether it also scales internally.
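As a small illustration of the overflow problem, the sketch below uses deliberately large, made-up covariate values. Nothing here is taken from lifelines or `coxph()`; it only shows why subtracting the mean before exponentiating helps.

```python
import math

beta = 1.0
xs = [795.0, 800.0, 805.0]  # hypothetical, deliberately large covariate values
x_bar = sum(xs) / len(xs)

# Uncentered: exp(beta * x) exceeds the largest representable double.
try:
    raw_terms = [math.exp(beta * x) for x in xs]
    overflowed = False
except OverflowError:
    overflowed = True

# Centered: the arguments are now -5, 0 and 5, so the terms are well behaved.
centered_terms = [math.exp(beta * (x - x_bar)) for x in xs]

print(overflowed)
print(centered_terms)
```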

The coefficients $\beta$ in a Cox model are found by maximizing the partial likelihood of the data as a function of the coefficient values. This page shows the form of the partial likelihood and how it takes censoring into account. You solve by finding the coefficient values that bring the score function (the first derivatives of the log partial likelihood with respect to the coefficients) to 0. This answer shows the form of the score equation for a Cox model, although $\bar x$ in that formula has a different meaning: it is a risk-weighted average of the predictor values in place at an event time.
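As a sketch of how the score equation can be solved, here is a bare-bones Newton-Raphson iteration for a single time-fixed covariate with no tied event times. It is not the lifelines implementation (check the library source for what it actually does); the function names, the toy data, and the absence of step-size control are my own simplifications. With risk set $R_i$ and weights $w_j = \exp(x_j\beta)$, each event contributes $x_i - \sum_{j \in R_i} w_j x_j / \sum_{j \in R_i} w_j$ to the score, and the weighted variance of $x$ over $R_i$ to the information (the negative Hessian).

```python
import math

def score_and_information(times, events, xs, beta):
    """Score and observed information of the Cox log partial likelihood."""
    score, info = 0.0, 0.0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored subjects contribute only through risk sets
        risk = [j for j in range(len(times)) if times[j] >= times[i]]
        w = [math.exp(beta * xs[j]) for j in risk]
        s0 = sum(w)
        s1 = sum(wj * xs[j] for wj, j in zip(w, risk))
        s2 = sum(wj * xs[j] ** 2 for wj, j in zip(w, risk))
        risk_mean = s1 / s0               # risk-weighted average of x
        score += xs[i] - risk_mean
        info += s2 / s0 - risk_mean ** 2  # risk-weighted variance of x
    return score, info

def fit_beta(times, events, xs, beta=0.0, tol=1e-10, max_iter=50):
    """Plain Newton-Raphson: beta <- beta + score / information."""
    for _ in range(max_iter):
        score, info = score_and_information(times, events, xs, beta)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

# Hypothetical toy data: times, event flags (1 = event), binary covariate.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 0, 1, 1, 0, 1, 1]
xs = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]

beta_hat = fit_beta(times, events, xs)
print(beta_hat, score_and_information(times, events, xs, beta_hat)[0])
```

At convergence the printed score is numerically zero, which is exactly the condition described above. Production code would normally add step halving and handling for tied event times.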

Modeling Survival Data: Extending the Cox Model by Therneau and Grambsch goes into extensive detail.

Related Solutions

Solved – Weibull MLE: what is the method/algorithm used to perform the optimization

It's BFGS.

I'm not too sure of the details, but the basic idea is to do Newton's method with an approximation to the inverse of the Hessian instead of the true inverse of the Hessian. There is a good explanation in these notes (pdf), but I am not sure that it is enough to do an implementation. However, there do seem to be a lot of references on BFGS. It even has a Facebook page.
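For reference, the standard BFGS update can be written as follows. The notation is mine and is not taken from the linked notes: $f$ is the function being minimized (for example the negative log-likelihood), $H_k$ is the current approximation to the inverse Hessian, and $\alpha_k$ is a step size chosen by a line search.

$$\theta_{k+1} = \theta_k - \alpha_k H_k \nabla f(\theta_k), \qquad s_k = \theta_{k+1} - \theta_k, \qquad y_k = \nabla f(\theta_{k+1}) - \nabla f(\theta_k)$$

$$H_{k+1} = \left(I - \rho_k s_k y_k'\right) H_k \left(I - \rho_k y_k s_k'\right) + \rho_k s_k s_k', \qquad \rho_k = \frac{1}{y_k' s_k}$$

Only gradients are needed, so the Hessian is never computed or inverted explicitly.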

Cox Model – How to Compute Partial Log-Likelihood Function

This is technically a programming question with an easy programming answer. If you simply want the partial likelihood, why not fool R into giving it to you? Simply initialize beta and allow no iterations, then extract the loglik value from the coxph object. (see ?coxph.object).

For example:

## artificial data
library(survival)
n <- 100
t <- rexp(n)
c <- rbinom(n, 1, .2) ## status indicator for Surv(): 1 = event, 0 = censored (independent process)
x <- rbinom(n, 1, exp(-t)) ## some arbitrary relationship between x and t
betamax <- coxph(Surv(t, c) ~ x)
beta1 <- coxph(Surv(t, c) ~ x, init = c(1), control=list('iter.max'=0))

With example output:

> betamax$loglik
[1] -68.62548 -65.99652
> beta1$loglik
[1] -66.10908 -66.10908

You can even define a wrapper:

loglik <- function(beta, formula) {
  coxph(formula, init=beta, control=list('iter.max'=0))$loglik[2]
}

betas <- seq(0, 2, by=0.01)
logliks <- sapply(betas, loglik, Surv(t, c) ~ x)
plot(betas, logliks)
abline(v=betamax$coefficients)
