I am running a multivariate Cox regression and Stata provides the standard errors for each hazard ratio. How are these to be interpreted? I know that I want my coefficients to be large compared to my SEs, but I don't know if the same rule applies to ratios.
Solved – How to interpret standard errors in a Cox model
cox-model, hazard, standard-error, stata, survival
Related Solutions
The value of the ratio can be obtained as follows, noting that the discrete sampling in lines 3 and 4 of the code implies 3 options for $X1$, 3 for $X2$, and 7 for $Y$ conditional on each combination of $X1$ and $X2$:
$(N-K)_{old} = 1000 - 3 = 997$

$(N-K)_{new} = (3 \times 3 \times 7) - 3 = 63 - 3 = 60$

Ratio of standard errors $= \sqrt{\frac{(N-K)_{new}}{(N-K)_{old}}} = \sqrt{\frac{60}{997}} = 0.2453172$
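The degrees-of-freedom arithmetic above can be checked directly in a couple of lines of plain Python (no statistics library needed):

```python
from math import sqrt

# Degrees of freedom = N observations minus K = 3 estimated coefficients
nk_old = 1000 - 3          # unaggregated data: 997
nk_new = 3 * 3 * 7 - 3     # aggregated data: 63 cells minus 3 coefficients = 60

ratio = sqrt(nk_new / nk_old)
print(nk_old, nk_new, ratio)   # ratio is approximately 0.2453172
```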
Why is $(N-K)$ relevant here? Because it is the denominator in the formula for the OLS estimator $s^2$ of the variance parameter $\sigma_0^2$:

$s^2 = \frac{SSR}{N-K}$
where SSR is the sum of squared residuals. This in turn feeds into the estimator of the variance of the coefficient vector $\hat\beta$, with $X$ as the matrix of independent variables:
$V[\hat\beta \mid X] = \sigma_0^2 (X'X)^{-1}$, estimated by $s^2 (X'X)^{-1}$,
and so into the standard errors of the coefficients. These formulae also apply to the WLS coefficients provided that SSR and $X$ are based on the weighted variables (WLS being equivalent to OLS on the weighted variables).
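A minimal numpy sketch of these formulae, using simulated data (the design matrix and true coefficients below are illustrative choices, not from the original code):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3                        # 100 observations, 3 coefficients (incl. intercept)
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)        # OLS coefficients
ssr = np.sum((y - X @ beta_hat) ** 2)               # sum of squared residuals
s2 = ssr / (n - k)                                  # s^2 = SSR / (N - K)
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))  # SEs from s^2 (X'X)^{-1}
print(se)
```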
However, it is important to note (especially for anyone who may be using regression with aggregate data in real applications such as in economics) that the simple ratio formula above only works because of particular features of this case. In general the effect of aggregation on standard errors is more complex, with effects via the SSR and $X’X$ (which happen to cancel out in this case) needing to be considered as well as those via $(N – K)$.
A relevant feature of this case is that aggregation does not group together different $Y$ values: each combination of $X1$, $X2$ and $Y$ forms a separate aggregate, so there is no averaging of $Y$ values that would tend to reduce the residuals.

Suppose, by contrast, that a sample has two observations of $Y$ for each observed $X$ value, and that in each case the two observations happen to lie on opposite sides of the fitted line but the same distance from it. Then regression on the unaggregated data will produce non-zero standard errors of the coefficients, but aggregation at each $X$ value (that is, averaging of its two $Y$ observations) will produce a perfect fit with zero residuals and therefore zero standard errors. In that case the zero SSR in the aggregate model will dominate any effects via $(N-K)$ and $X'X$.
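The averaging effect described above can be demonstrated with a toy dataset; the line $y = 2x$ and the $\pm 0.5$ offsets are arbitrary choices for illustration:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
# Two y observations per x, equidistant on opposite sides of the line y = 2x
y_pairs = np.array([[2 * xi - 0.5, 2 * xi + 0.5] for xi in x])

# Aggregate: average the two y values at each x; the means lie exactly on y = 2x
y_bar = y_pairs.mean(axis=1)

X = np.column_stack([np.ones_like(x), x])
beta, ssr, *_ = np.linalg.lstsq(X, y_bar, rcond=None)
print(beta)                                      # approximately [0, 2]
print(float(ssr[0]) if ssr.size else 0.0)        # SSR ~ 0: perfect fit, zero SEs
```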
An odds ratio of 2 means that the event is 2 times more probable given a one-unit increase in the predictor.
It means the odds would double, which is not the same as the probability doubling.
In Cox regression, a hazard ratio of 2 means the event will occur twice as often at each time point given a one-unit increase in the predictor.
Aside from a bit of hand-waving, yes: the rate of occurrence doubles. It's like a scaled instantaneous probability.
Are these not practically the same thing?
They're almost the same thing when doubling the odds of the event is almost the same as doubling the hazard of the event. They're not automatically similar, but under some (fairly common) circumstances they may correspond very closely.
You may want to consider the difference between odds and probability more carefully.
See, for example, the first sentence here, which makes it clear that odds are the ratio of a probability to its complement. So for example, increasing the odds (in favor) from 1 to 2 is the same as probability increasing from $\frac{1}{2}$ to $\frac{2}{3}$. Odds increase faster than probability increases. For very small probabilities, odds-in-favor and probability are very similar, while odds-against become increasingly similar to (in the sense that the ratio will go to 1) reciprocals of probability as probability gets small. An odds ratio is simply the ratio of two sets of odds. Increasing the odds ratio while holding a base odds constant corresponds to increasing the other odds, but may or may not be similar to the relative change in probability.
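The odds/probability arithmetic in that example can be sketched as follows (the helper names are mine, not standard library functions):

```python
def odds(p):
    """Odds in favor: p / (1 - p)."""
    return p / (1 - p)

def prob(o):
    """Probability recovered from odds: o / (1 + o)."""
    return o / (1 + o)

# Doubling the odds from 1 to 2 moves probability from 1/2 to 2/3, not to 1
print(prob(1), prob(2))

# For very small probabilities, odds-in-favor and probability nearly coincide
print(odds(0.01))   # close to 0.01, so doubling the odds ~ doubling p
```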
You may also want to ponder the difference between hazard and probability (see my earlier discussion where I make mention of hand-waving; now we don't gloss over the difference). For example, if a probability is 0.6, you can't double it, but an instantaneous hazard of 0.6 can be doubled to 1.2. They're not the same thing, in the same way that probability density is not probability.
Best Answer
The formula for the Cox proportional hazards model is:
$$h(t)=h_0(t)\,e^{\beta_1 x_1+\cdots+\beta_n x_n}$$
All the $\beta$ coefficients are thus independent of the baseline hazard $h_0(t)$, which allows comparisons between groups via hazard ratios. For instance, if we have two treatment arms, one with placebo ($X_{treat}=0$) and one with active substance ($X_{treat}=1$), and we also adjust for sex, we get:
$$HR=\frac{h_{treated}(t)}{h_{placebo}(t)}=\frac{h_0(t)\,e^{\beta_{treat}\cdot 1+\beta_{sex}x_{sex}}}{h_0(t)\,e^{\beta_{treat}\cdot 0+\beta_{sex}x_{sex}}} = e^{\beta_{treat}\cdot 1-\beta_{treat}\cdot 0+\beta_{sex}x_{sex}-\beta_{sex}x_{sex}}=e^{\beta_{treat}}$$
Note that the hazard ratio is the exponential of $\beta$, so the SE that Stata reports alongside a hazard ratio is also on the exponentiated scale. When you compare a coefficient with its SE, you need to do so with $\beta$ on the logarithmic scale. I would refrain from using the SEs for anything other than confidence intervals and p-values.
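As a concrete sketch of working on the log scale (the hazard ratio and SE below are made-up values for illustration, not output from any real model), a 95% confidence interval is formed around $\beta$ and then exponentiated:

```python
from math import exp, log

# Hypothetical values for illustration only (not from a real Stata run):
hr = 2.0        # reported hazard ratio, i.e. exp(beta)
se_log = 0.15   # standard error of beta on the log scale

beta = log(hr)
lo = exp(beta - 1.96 * se_log)   # lower 95% CI limit for the hazard ratio
hi = exp(beta + 1.96 * se_log)   # upper 95% CI limit
print(lo, hi)
```

Note that the resulting interval is asymmetric around the hazard ratio, which is one reason comparing the ratio directly to an exponentiated SE is misleading.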