Solved – Can a Cox Proportional-Hazards Model be built with only continuous predictors?

cox-model, prediction, r, regression, survival

The literature on survival analysis comes mainly from medical science, where the researcher typically wants to compare the effect of one treatment with that of another. So far, all the examples I have read and studied contain one or more categorical variables (with at least 2 levels) and possibly some continuous variables as covariates. In any case, the main interest is in a categorical variable (e.g. treatment).

Is it possible and correct to run a non-parametric Cox model (or alternatively a parametric one) using only one or more continuous variables? In particular, without categorizing the continuous variable into 2 or more groups?

Something like a logistic regression.

To give you a more practical example, I'm trying to model the survival of, say, a bush in a field depending on the number of cows in the same field.

I'm pretty sure it can be done, but the lack of examples leaves me in doubt.

If it is possible, how can one use the predict function to predict survival when the predictor has a specified value? For example, the survival of my plant when 10 cows are in the field…

Any help is welcome!

Best Answer

Either a fully parametric survival model (e.g., survreg in R) or a Cox* proportional hazards semi-parametric regression (e.g., coxph in R) is fully capable of handling continuous variables as predictors. If you have a continuous predictor, keeping it continuous is preferred over breaking it into categories. You may need to transform the continuous predictor to meet the linearity and proportional hazards assumptions, but that is not really different from any other regression. Using the predict function on a coxph or survreg fit in R is no trickier for a continuous predictor than for a categorical one (although there can be a learning curve if you are trying survival predictions for the first time).
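As a minimal sketch of that workflow, the R code below fits coxph with a single continuous predictor and then predicts survival at a chosen value. The data are simulated: the data frame `bush`, the column names `time`, `status` and `cows`, and every numeric setting are assumptions made for illustration, not taken from the question.

```r
library(survival)

# Simulated data (hypothetical): 200 bushes, each in a field with 0 to 30 cows
set.seed(42)
n <- 200
cows <- runif(n, 0, 30)
death_time <- rexp(n, rate = 0.05 * exp(0.08 * cows))  # hazard rises with cows
cens_time <- runif(n, 5, 40)                           # random censoring
bush <- data.frame(
  time   = pmin(death_time, cens_time),
  status = as.numeric(death_time <= cens_time),        # 1 = died, 0 = censored
  cows   = cows
)

# Cox model with one continuous predictor, no categorization
fit <- coxph(Surv(time, status) ~ cows, data = bush)
hr_per_cow <- exp(coef(fit)[["cows"]])   # hazard ratio per additional cow

# Check the proportional hazards assumption (scaled Schoenfeld residuals)
ph_check <- cox.zph(fit)

# Predicted survival curve for a field with 10 cows
curve10 <- survfit(fit, newdata = data.frame(cows = 10))
surv10 <- summary(curve10, times = c(5, 10, 20))$surv

# Parametric alternative: Weibull model and predicted median survival time
wfit <- survreg(Surv(time, status) ~ cows, data = bush, dist = "weibull")
median10 <- predict(wfit, newdata = data.frame(cows = 10),
                    type = "quantile", p = 0.5)
```

Here `surv10` holds the estimated survival probabilities at times 5, 10 and 20 for a field with 10 cows, and `plot(curve10)` would draw the whole curve. Printing `ph_check` shows the test of the proportional hazards assumption for `cows`.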


*For clarity in terminology, a Cox proportional hazards regression model of survival is semi-parametric, in that there are no parameters characterizing the baseline hazard, but the coefficients are parameters whose values are determined by the regression.

A fully parametric survival model specifies a particular functional form for the hazard or survival function along with parameters describing the influence of predictors on survival. A truly non-parametric model (log-rank test) simply looks for differences between survival curves without trying to estimate parameters of either the baseline hazard or of the effects of the predictor variable. To avoid confusion, these last two types of models are perhaps best not called "Cox" models (even though the log-rank test is sometimes called the Mantel-Cox test), reserving that name for the semi-parametric regression case.
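To make the three categories concrete, here is a rough R sketch on simulated data. The `bush` data frame, its columns, and the cut point of 15 cows are all hypothetical. The log-rank test compares groups, so the continuous predictor has to be split for it, whereas both regression approaches use it as is.

```r
library(survival)

# Simulated data (hypothetical), same structure as a bush/cows field study
set.seed(7)
n <- 200
cows <- runif(n, 0, 30)
death_time <- rexp(n, rate = 0.05 * exp(0.08 * cows))
cens_time <- runif(n, 5, 40)
bush <- data.frame(
  time   = pmin(death_time, cens_time),
  status = as.numeric(death_time <= cens_time),
  cows   = cows
)
# Arbitrary split at 15 cows, needed only for the log-rank test
bush$grazing <- factor(ifelse(bush$cows > 15, "high", "low"))

# Non-parametric: log-rank test between two groups, no coefficients estimated
lr <- survdiff(Surv(time, status) ~ grazing, data = bush)

# Semi-parametric: Cox regression, one coefficient, baseline hazard unspecified
cox <- coxph(Surv(time, status) ~ cows, data = bush)

# Fully parametric: Weibull regression, coefficients plus a scale parameter
wei <- survreg(Surv(time, status) ~ cows, data = bush, dist = "weibull")
```

The log-rank result `lr` carries only a test statistic, `coef(cox)` has a single coefficient for `cows`, and `wei` additionally estimates an intercept and a scale that pin down the hazard's shape. Note that survreg coefficients are on the log-time scale, so their sign is opposite to that of the Cox coefficient.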