Solved – interval censored survival analysis with time dependent covariates

censoring, cox-model, survival, time-varying-covariate

I'm working on a long-term, large tree data set from Africa. I have data on the same set of individuals from the years 2006, 2008, 2011 and 2015. The data consist of tree status (alive/dead) at each census and covariates such as elephant damage (type and proportion), fire damage (type and proportion), rainfall, species, crown height, etc. I would like to use the data from 2006-2011 to predict mortality in 2015, based on the values of the covariates for 2011-2015.

Since the data include censored individuals, I felt I should apply a survival analysis technique such as Cox PH regression. However, the data seem to be interval censored, since the exact time of death is not observed. I only know that death occurred before 2006, between 2006 and 2008, or between 2008 and 2011. I also have time-dependent covariates, such as the proportion of elephant damage in each census interval.

Is there an R package for conducting interval-based CoxPH with time dependent covariates?
Is there some other analysis framework that would be better suited to this dataset and my question, such as a mixed-effects logistic regression with time-dependent covariates?

I am new to survival analysis. Any guidance would be greatly appreciated!

Best Answer

Semi-parametric models such as Cox regression are not easily applied in the presence of interval censoring. In this situation the default choice is to use parametric models.
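
To illustrate why a parametric model copes with interval censoring: a tree found dead between two censuses contributes S(L) - S(R) to the likelihood, where S is the model's survival function and L and R are the census times bracketing the death, while a tree still alive at the last census contributes S(L). The Python sketch below evaluates that log-likelihood for a Weibull proportional hazards form, chosen here only as an example. It is a minimal illustration under stated assumptions, not any package's implementation: the function names, the example records and the parameter values are all hypothetical, and the single covariate is treated as fixed over time rather than time-dependent.

```python
import math

def weibull_survival(t, shape, scale, linear_predictor=0.0):
    """Weibull PH survival: S(t | x) = exp(-(t / scale)**shape * exp(linear_predictor))."""
    if t <= 0:
        return 1.0
    if math.isinf(t):
        return 0.0
    return math.exp(-((t / scale) ** shape) * math.exp(linear_predictor))

def interval_censored_loglik(records, shape, scale, beta):
    """Sum of log(S(left) - S(right)) over records of the form (left, right, x).

    right = math.inf marks a tree still alive at its last census (right-censored).
    x is a single time-fixed covariate (a simplifying assumption).
    """
    total = 0.0
    for left, right, x in records:
        lp = beta * x
        prob = (weibull_survival(left, shape, scale, lp)
                - weibull_survival(right, shape, scale, lp))
        total += math.log(prob)
    return total

# Hypothetical trees; time is measured in years since the 2006 census.
records = [
    (0.0, 2.0, 0.6),       # died between the 2006 and 2008 censuses
    (2.0, 5.0, 0.3),       # died between the 2008 and 2011 censuses
    (9.0, math.inf, 0.0),  # still alive in 2015
]
loglik = interval_censored_loglik(records, shape=1.2, scale=15.0, beta=0.8)
```

In practice the shape, scale and coefficient would be chosen by maximising this quantity with a numerical optimiser; the values above are placeholders.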

Conversely, time-dependent covariates are easy to include with right-censored data but not with interval-censored data. Few methods have been proposed for including time-dependent covariates under interval censoring, and this is probably the best available approach. However, as far as I know, this method has only been implemented in SAS. The Weibull model is among the methods its authors discuss, so if you are interested in a proportional hazards model you will be fine.

For your analyses I see three options:

  1. Choose this latter approach based on parametric survival, which would be ideal for your data, and use SAS to fit your models.
  2. If you prefer to use R, make an assumption about when events occur within each interval (for example, that event times are normally distributed), and then run a classical right-censored analysis using Cox regression (in this case you can refer to this R tutorial).
  3. Seek alternative approaches such as the one you mentioned based on logistic regression.
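
As a rough illustration of option 3, a common way to set up a logistic regression with time-dependent covariates is to expand the data into one row per tree per census interval in which the tree was still at risk, with a binary outcome indicating whether it died in that interval. The Python sketch below shows only that data-preparation step, not the model fit. The function name, the field names and the example values are hypothetical, and the census years are taken from the question.

```python
CENSUS_YEARS = [2006, 2008, 2011, 2015]  # census years given in the question

def person_period_rows(tree_id, first_dead_census, covariates_by_interval,
                       census_years=CENSUS_YEARS):
    """Return one row per census interval in which the tree was at risk.

    first_dead_census: year of the first census at which the tree was found
        dead, or None if it was alive at every census.
    covariates_by_interval: dict mapping (start, end) to a dict of covariate
        values for that interval (hypothetical layout).
    """
    rows = []
    for start, end in zip(census_years, census_years[1:]):
        if first_dead_census is not None and first_dead_census <= start:
            break  # already dead when the interval starts: not at risk
        died = int(first_dead_census == end)
        row = {"tree_id": tree_id, "start": start, "end": end, "died": died}
        row.update(covariates_by_interval.get((start, end), {}))
        rows.append(row)
        if died:
            break  # no further intervals after the death is recorded
    return rows

# Hypothetical tree found dead at the 2011 census.
rows = person_period_rows(
    "t1",
    2011,
    {(2006, 2008): {"elephant_damage": 0.1},
     (2008, 2011): {"elephant_damage": 0.4}},
)
```

Each row can then be treated as a binary observation, with the covariates taking their interval-specific values. Because the census intervals have unequal lengths (2, 3 and 4 years), a model fitted to these rows would probably need an interval-specific intercept or similar adjustment, and a random effect per tree or plot would turn it into the mixed-effects version mentioned in the question.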