Solved – Difference between Cox regression and logistic regression; question about correlation assessment

cox-model, logistic, pearson-r, regression, spearman-rho

  1. What is the difference between Cox regression and a logistic regression? I'm writing my own thesis and I have to choose between these two.
  2. Do I have to assess the covariance between the variables I want to put as covariates in the regression? If so, do I have to use Pearson or Spearman test?

Best Answer

1) A logistic regression estimates the probability of an event happening based on the factors you feed into your model, and it uses the logit link to map a linear combination of those factors to a probability. (I will assume that you know this type of regression quite well so I will not go too much into it).
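As a quick reminder of what that looks like in practice, here is a minimal sketch of how a fitted logistic model turns covariates into a probability. The function name `logistic_probability` and the coefficient values are made up for illustration; they do not come from any real fit.

```python
import math

def logistic_probability(intercept, coefficients, covariates):
    """Return P(event) by applying the inverse logit to the linear predictor."""
    linear_predictor = intercept + sum(
        b * x for b, x in zip(coefficients, covariates)
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Hypothetical coefficients and covariate values, chosen only for illustration
p = logistic_probability(intercept=-1.0, coefficients=[0.8, 0.05], covariates=[1, 10])
print(f"P(event) = {p:.3f}")
```

Note that time does not appear anywhere: the output is a single probability that the event occurs.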

A Cox regression (or Cox proportional hazards model) is quite different. It is used to explore the relationship between the 'survival' of a subject and the explanatory variables. It operates like a linear regression, except that the response variable $Y$ is the hazard function at a given time $t$. The model takes the form:

$(Y =)\ \lambda_i(t) = \lambda_0(t)\exp(\beta^T X_i)$

where $\lambda_0(t)$ is the baseline hazard, which is analogous to the intercept term in linear regression: it is the hazard (the instantaneous rate, not the probability) of your 'event' occurring when all of the explanatory variables are zero. The explanatory variables and regression coefficients enter through the exponential term $\exp(\beta^T X_i)$, where $\beta$ is the vector of coefficients and $X_i$ is the vector of explanatory variables for person $i$.
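To make the formula concrete, here is a minimal sketch that evaluates the hazard for one subject. The linear `baseline_hazard` and the values in `beta` are assumptions made purely for illustration; in an actual Cox fit the baseline hazard is left unspecified and the coefficients are estimated from the data.

```python
import math

def baseline_hazard(t):
    # Hypothetical baseline hazard that rises linearly with time (illustration only)
    return 0.01 * t

def cox_hazard(t, beta, x):
    """Hazard at time t: baseline hazard times exp(beta^T x)."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return baseline_hazard(t) * math.exp(linear_predictor)

beta = [0.7, -0.2]        # hypothetical coefficients
subject = [1, 3]          # hypothetical covariate values for one person
all_zero = [0, 0]         # covariates all zero

for t in (1.0, 5.0, 10.0):
    print(t, cox_hazard(t, beta, subject), cox_hazard(t, beta, all_zero))
```

With all covariates set to zero the exponential term equals 1, so the second column printed is just the baseline hazard itself.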

The 'proportional hazards' in the model's name refers to the assumption that the relationship between the hazard and the explanatory variables stays the same over time. This means that the hazard functions of any two individuals are proportional at every point in time. For example, if subject A has a risk of the 'event' twice as high as subject B at time $t$, then subject A's risk remains twice that of subject B at all later times.
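A short derivation shows where this comes from. Dividing the hazards of two subjects A and B (using the same notation as the model, and assuming $\lambda_0(t) > 0$), the baseline hazard cancels:

$\frac{\lambda_A(t)}{\lambda_B(t)} = \frac{\lambda_0(t)\exp(\beta^T X_A)}{\lambda_0(t)\exp(\beta^T X_B)} = \exp\left(\beta^T (X_A - X_B)\right)$

The right-hand side does not contain $t$, so as long as the covariates themselves do not change over time, a hazard ratio of 2 at one time is a hazard ratio of 2 at every time.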

As you can see, unlike logistic regression, this model depends on time: the hazard of an 'event' happening can change with time, whereas logistic regression only models whether the event happens, not when.