$S(t|x_{1})=\exp(-\Lambda(t|x_{1}))$ where $\Lambda(t|x_{1})=\int_{0}^{t}\lambda(u|x_{1})du$, with $x_{1}=1$ for group 1 and $0$ otherwise. For a hazard function modelled as $\lambda(t|x_{1})=\lambda_{0}(t)\exp(\beta_{1}x_{1})$, the hazard ratio $r$ is defined as
\begin{align*}
r &= \frac{\lambda(t|x_{1}=1)}{\lambda(t|x_{1}=0)}\\
&=\frac{\lambda_{0}(t)\exp(\beta_{1})}{\lambda_{0}(t)\exp(0)}\\
&=\frac{\exp(\beta_{1})}{1}\\
&=\exp(\beta_{1})
\end{align*}
Thus
\begin{align*}
\Lambda(t|x_{1}=1)&=\int_{0}^{t}\lambda(u|x_{1}=1)du\\
&=\int_{0}^{t}\lambda_{0}(u)r\,du\\
&=r\int_{0}^{t}\lambda(u|x_{1}=0)du\\
&=r\Lambda(t|x_{1}=0)
\end{align*}
Accordingly
\begin{align*}
S(t|x_{1}=1)&=\exp(-\Lambda(t|x_{1}=1))\\
&=\exp(-r\Lambda(t|x_{1}=0))\\
&=\frac{1}{\exp(r\Lambda(t|x_{1}=0))}\\
&=\left[\frac{1}{\exp(\Lambda(t|x_{1}=0))}\right]^{r}\\
&=\left[\exp(-\Lambda(t|x_{1}=0))\right]^{r}\\
&=\left[S(t|x_{1}=0)\right]^{r}
\end{align*}
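The identity $S(t|x_{1}=1)=\left[S(t|x_{1}=0)\right]^{r}$ can be checked numerically. The sketch below assumes a Weibull-type baseline cumulative hazard and an arbitrary value of $\beta_{1}$; neither comes from the question, and the function names are hypothetical.

```python
import math

def baseline_cumulative_hazard(t, scale=2.0, shape=1.5):
    """Assumed baseline: Lambda_0(t) = (t / scale) ** shape (Weibull form, illustration only)."""
    return (t / scale) ** shape

def survival(t, x1, beta1):
    """S(t | x1) = exp(-Lambda_0(t) * exp(beta1 * x1)) under proportional hazards."""
    return math.exp(-baseline_cumulative_hazard(t) * math.exp(beta1 * x1))

beta1 = 0.7            # arbitrary log hazard ratio, chosen for illustration
r = math.exp(beta1)    # hazard ratio

for t in (0.5, 1.0, 3.0):
    s0 = survival(t, 0, beta1)
    s1 = survival(t, 1, beta1)
    # The two printed values should agree up to floating-point error.
    print(f"t={t}: S1={s1:.6f}  S0**r={s0 ** r:.6f}")
```

Any other baseline would do, since the identity only uses the proportional hazards form.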
So yes $r=HR$. The derivation of $S(t|x_{1}=1)=\left[S(t|x_{1}=0)\right]^{r}$ above depended on the definition of the HR and the model form $\lambda(t|x_{1})=\lambda_{0}(t)\exp(\beta_{1}x_{1})$. Since no time-varying covariates are in the exponential term, the HR is constant over time, i.e. the hazards are proportional: $\lambda(t|x_{1}=1)=r\lambda(t|x_{1}=0)$ shows that the hazard in group 1 is proportional to the hazard in group 0, with time-constant multiplicative factor $r=\exp(\beta_{1})$. Finally noting that $Pr[T>t|x_{1}]=S(t|x_{1})$
\begin{align*}
Odds(x_{1}=1)&=\frac{Pr[T>t|x_{1}=1]}{1-Pr[T>t|x_{1}=1]}\\
&=\frac{S(t|x_{1}=1)}{1-S(t|x_{1}=1)}\\
&=\frac{\exp(-r\Lambda(t|x_{1}=0))}{1-\exp(-r\Lambda(t|x_{1}=0))}\\
&=\frac{1}{\exp(r\Lambda(t|x_{1}=0))-1}\\
&\approx\frac{1}{\exp(r\Lambda(t|x_{1}=0))}
\end{align*}
where the last step is an approximation that holds when $\exp(r\Lambda(t|x_{1}=0))\gg 1$. Similarly, when $\exp(\Lambda(t|x_{1}=0))\gg 1$,
\begin{align*}
Odds(x_{1}=0)&=\frac{1}{\exp(\Lambda(t|x_{1}=0))-1}\\
&\approx\frac{1}{\exp(\Lambda(t|x_{1}=0))}
\end{align*}
So
\begin{align*}
\frac{Odds(x_{1}=1)}{Odds(x_{1}=0)}&\approx\frac{\exp(\Lambda(t|x_{1}=0))}{\exp(r\Lambda(t|x_{1}=0))}\\
&=\exp[\Lambda(t|x_{1}=0)(1-r)]\\
&=\exp[-\Lambda(t|x_{1}=0)(r-1)]
\end{align*}
Writing $OR$ for this odds ratio, this implies
\begin{align*}
r\approx\frac{\Lambda(t|x_{1}=0)-\log(OR)}{\Lambda(t|x_{1}=0)}
\end{align*}
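As a rough numerical check, the sketch below compares the odds ratio computed directly from $S/(1-S)$ in each group with the closed form $\exp[-\Lambda(t|x_{1}=0)(r-1)]$. The value $r=1.5$ and the grid of cumulative hazards are arbitrary assumptions, and the function names are hypothetical. In this sketch the two quantities are close only when $\Lambda(t|x_{1}=0)$ is large.

```python
import math

def survival_odds(cum_hazard):
    """Odds of surviving past t: S / (1 - S), with S = exp(-cum_hazard)."""
    s = math.exp(-cum_hazard)
    return s / (1.0 - s)

def direct_odds_ratio(lambda0, r):
    """Odds(x1=1) / Odds(x1=0), using Lambda(t|x1=1) = r * Lambda(t|x1=0)."""
    return survival_odds(r * lambda0) / survival_odds(lambda0)

def closed_form(lambda0, r):
    """The expression exp[-Lambda(t|x1=0) * (r - 1)]."""
    return math.exp(-lambda0 * (r - 1.0))

r = 1.5  # arbitrary hazard ratio, for illustration only
for lambda0 in (0.1, 1.0, 5.0, 10.0):
    print(f"Lambda0={lambda0}: direct={direct_odds_ratio(lambda0, r):.6g}  "
          f"closed form={closed_form(lambda0, r):.6g}")
```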
Edit: I am not so sure about $r=p/(1-p)$ for some $p$, since if $p$ is a probability this looks like an odds rather than an odds ratio (OR). The only thing I can think of is the following: assume that in the expression $\exp[-\Lambda(t|x_{1}=0)(r-1)]$ the quantity $\Lambda(t|x_{1}=0)(r-1)$ is "small". Then using $e^{x}\approx 1+x$ for small $x$, $\exp[-\Lambda(t|x_{1}=0)(r-1)]\approx 1-\Lambda(t|x_{1}=0)(r-1)$ so that
\begin{align*}
OR &\approx 1-\Lambda(t|x_{1}=0)(r-1)\\
\Longleftrightarrow\quad r &\approx \frac{\Lambda(t|x_{1}=0)+1-OR}{\Lambda(t|x_{1}=0)}
\end{align*}
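Setting $OR=1$ in this expression just gives $r\approx 1$, so it does not produce the form $p/(1-p)$ on its own. Solving $r=p/(1-p)$ for $p$ gives $p=r/(1+r)$, which under this approximation is $p\approx\frac{\Lambda(t|x_{1}=0)+1-OR}{2\Lambda(t|x_{1}=0)+1-OR}$, and I do not see a natural interpretation of that quantity as the probability of an event in group 0. So even when $\Lambda(t|x_{1}=0)(r-1)$ is "small", I cannot recover the HR as an odds.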
If the proportional hazard assumption holds, then in principle the choices of reference or 0 values for predictors $x$ don't matter. You could re-write the formula you provided for the hazard as:
$$h(t|x(t)) = h_0(t)\exp(-\bar x' \beta) \exp(x(t)'\beta)= h_{0\bar x}(t) \exp(x(t)'\beta),$$
a constant multiplicative scaling of the original baseline hazard that will then work with the un-centered predictor values. Any re-centering of predictor variables will just mean a corresponding shift in the corresponding baseline hazard function, which isn't even directly evaluated by the Cox model.
In practice, the exponential can lead to numerical instability. The help page for the R coxph() function says:
"The routine internally scales and centers data to avoid overflow in the argument to the exponential function. These actions do not change the result, but lead to more numerical stability."
I suspect that the lifelines implementation centers to avoid that practical problem, with the equation written to show that centering explicitly. I don't know whether it also scales internally.
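To illustrate the general overflow concern, here is a small sketch using only the Python standard library. It is not a description of what coxph() or lifelines do internally; the covariate values and the coefficient are made up for illustration.

```python
import math

beta = 0.05
# Hypothetical covariate recorded on a large raw scale (assumption for illustration).
x = [20000.0, 20010.0, 20025.0]

def relative_risks(values, beta):
    """exp(beta * x) for each subject."""
    return [math.exp(beta * v) for v in values]

# Uncentered: beta * x is about 1000, which is too large for a double.
try:
    relative_risks(x, beta)
    overflowed = False
except OverflowError:
    overflowed = True
print("uncentered overflowed:", overflowed)

# Centered: subtracting the mean keeps the exponent small, and ratios
# between subjects (the hazard ratios) are unchanged by the shift.
x_bar = sum(x) / len(x)
centered = relative_risks([v - x_bar for v in x], beta)
print("centered relative risks:", centered)
print("ratio subject 2 / subject 1:", centered[1] / centered[0])
```

The ratio between two subjects depends only on the difference of their covariate values, which is why the shift is harmless.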
The coefficients $\beta$ in a Cox model are found by maximizing the partial likelihood of the data as a function of the coefficient values. This page shows the form of the partial likelihood and how it takes censoring into account. You solve by finding coefficient values that bring the first derivatives of the log partial likelihood with respect to the coefficients, the score function, to 0. This answer shows the form of the score equation for a Cox model, although $\bar x$ in that formula takes on a different meaning as a risk-weighted average of predictor values in place at an event time.
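A minimal sketch of solving the score equation by Newton-Raphson, assuming a single covariate and no tied event times. The data are invented and the function names are hypothetical; real implementations handle ties, multiple covariates, and step-size safeguards.

```python
import math

def score_and_information(beta, times, events, x):
    """Score U(beta) and observed information I(beta) of the log partial likelihood."""
    score, info = 0.0, 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        w = [math.exp(beta * x[j]) for j in risk]
        total = sum(w)
        # Risk-weighted mean and mean square of x among those still at risk.
        mean = sum(wj * x[j] for wj, j in zip(w, risk)) / total
        mean_sq = sum(wj * x[j] ** 2 for wj, j in zip(w, risk)) / total
        score += x[i] - mean
        info += mean_sq - mean ** 2
    return score, info

def fit_beta(times, events, x, tol=1e-10, max_iter=50):
    """Newton-Raphson iterations beta <- beta + U / I, starting from beta = 0."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(beta, times, events, x)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

# Invented data: event indicator 1 = event observed, 0 = censored.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 0, 1, 1, 0, 1, 1]
x = [1, 0, 1, 1, 0, 0, 1, 0]

beta_hat = fit_beta(times, events, x)
final_score, _ = score_and_information(beta_hat, times, events, x)
print("beta_hat:", beta_hat, "score at beta_hat:", final_score)
```

The weighted mean inside the loop is the risk-weighted average of predictor values mentioned above.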
Modeling Survival Data: Extending the Cox Model by Therneau and Grambsch goes into extensive detail.
Best Answer
Subtracting the mean from the covariate values can help in fitting a Cox model, as otherwise the exponentiations can lead to overflow. I recall that the R coxph() function internally mean-centers and standardizes (to unit standard deviation) all continuous covariates for that reason, even though it reports coefficients appropriate to the original scales of the covariates.
In the formula with the mean subtracted, you can factor out the constant terms associated with the mean covariate values into the baseline hazard:
$$\begin{aligned} h(t | x) &= b_0(t) \exp \left(\sum_{i=1}^n b_i (x_i - \overline{x_i})\right)\\ &=\left(\frac{b_0(t)} {\exp \left(\sum_{i=1}^n b_i \overline{x_i}\right)}\right)\exp \left(\sum_{i=1}^n b_i x_i\right). \end{aligned}$$
Thus there's no change in the modeled coefficients, just in the definition of the baseline hazard.
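One way to see this numerically: the log partial likelihood takes the same value for raw and mean-centered covariates at any coefficient value, so its maximizer is the same. The sketch below assumes a single covariate, no tied event times, and invented data; the function name is hypothetical.

```python
import math

def log_partial_likelihood(beta, times, events, x):
    """Cox log partial likelihood for one covariate, assuming no tied event times."""
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects only appear in risk sets
        risk_sum = sum(math.exp(beta * x[j])
                       for j, t_j in enumerate(times) if t_j >= t_i)
        total += beta * x[i] - math.log(risk_sum)
    return total

# Invented data: 1 = event observed, 0 = censored.
times = [2.0, 3.5, 4.1, 6.0, 7.2]
events = [1, 0, 1, 1, 0]
age = [61.0, 48.0, 55.0, 70.0, 52.0]

mean_age = sum(age) / len(age)
age_centered = [a - mean_age for a in age]

for beta in (-0.1, 0.0, 0.08):
    raw = log_partial_likelihood(beta, times, events, age)
    centered = log_partial_likelihood(beta, times, events, age_centered)
    # The constant exp(-beta * mean_age) cancels within each risk-set ratio.
    print(f"beta={beta}: raw={raw:.10f}  centered={centered:.10f}")
```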
The important "partial" terminology has to do with the "partial likelihood" that a Cox model optimizes to estimate coefficient values. Technically, a likelihood is (proportional to) the probability of observing the data given a set of parameter values. In a Cox model the actual observation times aren't modeled directly, so you don't model the probability of the data per se. The contribution of Cox was to recognize that, if you were willing to make a proportional-hazards assumption, you don't need to model the actual observation times and you can factor out the baseline hazard to start. What's left is then called the "partial likelihood" of the data given the Cox regression coefficients.
The "partial hazard" and "log-partial hazard" terminology isn't uniformly used in books on survival analysis; at least, it didn't show up in a quick search of a few electronic texts that I have on hand, including the classic text by Therneau and Grambsch on Cox models. It might be intended to emphasize the partial-likelihood basis of the coefficient estimates. I wouldn't worry too much about that terminology.
The hazard ratio is simply the ratio of two hazards. It's often represented for an individual having a set of covariate values with respect to the baseline hazard, as in your first examples, but in general you can calculate a hazard ratio between any two sets of covariate values that are included in a Cox model.
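As a small sketch of that last point, the hazard ratio between any two covariate profiles is $\exp\left(\sum_i b_i (x_{a,i} - x_{b,i})\right)$, since the baseline hazard cancels. The coefficients and profiles below are hypothetical, not estimates from any fitted model.

```python
import math

def hazard_ratio(beta, x_a, x_b):
    """Hazard ratio of profile a relative to profile b; the baseline hazard cancels."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(beta, x_a, x_b)))

# Hypothetical coefficients: age in years, treatment indicator (0/1).
beta = [0.03, 0.6]

older_treated = [65.0, 1.0]
younger_untreated = [55.0, 0.0]
reference = [0.0, 0.0]  # the "baseline" profile, all covariates at 0

hr_between_profiles = hazard_ratio(beta, older_treated, younger_untreated)
hr_vs_baseline = hazard_ratio(beta, older_treated, reference)
print("HR, older treated vs younger untreated:", hr_between_profiles)
print("HR, older treated vs baseline:", hr_vs_baseline)
```

The comparison against the all-zero profile is the special case usually shown alongside the baseline hazard.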