Solved – how to work with time-dependent data in Lasso Cox regression in glmnet R package

cox-model, glmnet, r, survival, time series

I am trying to reproduce this study: http://stm.sciencemag.org/content/7/299/299ra122

I have time-dependent features such as patient lab values and vital sign measurements, as well as features such as age and gender that do not change over time. The paper says "we fit a Cox proportional hazards model using the time until the onset of septic shock as the supervisory signal" and "time-to-event models were learned as a Cox proportional hazards model with lasso regularization (glmnet R package, version 1.9-8)".

I am new to survival analysis. I have searched extensively but have not found an example that uses glmnet Cox regression with time-dependent data. The only example I found is this:
https://github.com/cran/glmnet/blob/master/inst/doc/Coxnet.R
The data (patient.data) columns are not explained and do not seem to be time dependent.

To feed my data to the glmnet package, I know I need a matrix x (one row per patient and one column per feature) and a matrix y (one row per patient, with one column for the time of the event and one column for the event status).
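For reference, the input shape described above can be sketched as follows. This is only an illustration on simulated data: the feature names, sample values, and event rates are assumptions, not taken from the paper, and it covers only features that are fixed per patient.

```r
library(glmnet)

set.seed(1)
n <- 50  # number of patients
p <- 4   # number of time-fixed features (hypothetical)

# x: one row per patient, one column per feature (simulated values)
x <- matrix(rnorm(n * p), nrow = n, ncol = p,
            dimnames = list(NULL, c("age", "gender", "disease", "baseline_bp")))

# y: one row per patient, with columns named "time" and "status"
# (status: 1 = event observed, 0 = censored)
y <- cbind(time   = rexp(n, rate = 0.1),
           status = rbinom(n, 1, 0.7))

# Lasso-penalized Cox model; alpha = 1 (lasso) is glmnet's default
fit <- glmnet(x, y, family = "cox")
```

Each row of x here holds a single value per feature, so this layout by itself cannot represent a measurement that changes during follow-up.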

My question is: suppose I have 50 patients, and my features are age, gender, type of disease, blood pressure (measured every four hours for a month), and a lab value (measured every day for a month). What should my matrix x look like?

Best Answer

Is there a specific reason why you are using penalized regression?

It strikes me that with multiple rows per patient id, it makes sense to use cluster() in the survival package to denote which patient each observation belongs to.
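A minimal sketch of that idea, assuming the data are first reshaped into a long (counting-process) format with one row per patient per time interval. All data below are simulated, and the column names (id, tstart, tstop, event, age, bp) are hypothetical. The final step is an addition beyond the suggestion above: it assumes glmnet 4.1 or later, which accepts (start, stop] responses; version 1.9-8, cited in the paper, does not.

```r
library(survival)

set.seed(1)
n <- 50
age       <- round(rnorm(n, mean = 60, sd = 10))
follow_up <- pmin(ceiling(rexp(n, rate = 0.1)), 30)  # days observed, capped at 30
had_event <- rbinom(n, 1, 0.6)                       # 1 = event at end of follow-up

# Long format: one row per patient per day, with the covariate values
# that applied during that interval (tstart, tstop]
long <- do.call(rbind, lapply(seq_len(n), function(i) {
  k <- follow_up[i]
  data.frame(
    id     = i,
    tstart = 0:(k - 1),
    tstop  = 1:k,
    event  = c(rep(0, k - 1), had_event[i]),  # event can only occur in the last interval
    age    = age[i],                          # fixed covariate, repeated on every row
    bp     = rnorm(k, mean = 120, sd = 15)    # time-varying covariate
  )
}))

# Unpenalized Cox model; cluster(id) requests robust standard errors
# grouped by patient
fit <- coxph(Surv(tstart, tstop, event) ~ age + bp + cluster(id), data = long)

# Assumption: glmnet >= 4.1, which supports (start, stop] Surv responses
if (requireNamespace("glmnet", quietly = TRUE) &&
    packageVersion("glmnet") >= "4.1") {
  x_long <- as.matrix(long[, c("age", "bp")])
  y_long <- Surv(long$tstart, long$tstop, long$event)
  lasso_fit <- glmnet::glmnet(x_long, y_long, family = "cox")
}
```

In this layout, time-fixed features are simply repeated on every row for a patient, while time-varying features take whatever value applied during that interval. Measurements taken at different frequencies (every four hours versus daily) would need to be aligned to a common set of intervals first.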
