Survival Analysis – Selection of Patients in Retrospective Studies

cox-model, kaplan-meier, survival

I have some questions about selecting patients and their follow-up period for a retrospective survival analysis.

Let's say we want to perform a retrospective survival analysis on patients who had disease X between 2015 and 2019, and look at their survival as of the end of 2020. Technically we could analyze survival with follow-up of up to 5 years, because some patients will have that much follow-up. But we believe this would be incorrect, because follow-up is truncated for the later patients (patients from 2019 are allowed only about 1 year of follow-up) and censoring these patients at 2020 is not correct.

So if we include patients from 2015 to 2019, we can look at 1-year survival in 2020, because every patient has been allowed a theoretical follow-up of 1 year. If we want to check 2-year survival, we should include only patients from 2015 to 2018 (both censored patients and those with complete follow-up in this period). For 3-year survival, the period should be 2015 to 2017, and so on.

Then we came across the opinion that the above would be correct if we wanted to calculate the cumulative incidence or survival probability for different groups in this cohort (so it would be accurate to report survival probability at 1 year for 2015-2019 patients, but 3-year estimates for 2015-2017 patients only). But if the aim is only to look at the difference in survival between groups (using a log-rank test or the hazard ratio from a Cox model), and not at the incidence or survival probability itself, then all 2015-2019 patients could be included in the survival analysis for all time periods.

What would be the correct design? If all patients from 2015-2019 are included, could we report 1-, 3-, and 5-year survival as of 2020, or should we limit it to 1 year? Or does the design change according to the aim, as mentioned above?

Best Answer

You should be able to use all of your patients for the study. I assume that the entry dates represent dates of diagnosis or something similar, and that your interest is in survival following that entry date.

Yes, losing a patient to follow-up before the study end date in 2020 is a different type of censoring from censoring due to the end of the study. But both are forms of censoring that can be accounted for, with care, in survival analysis. In this type of situation, the end of the study is not considered a "truncation" in the technical sense. See for example Klein and Moeschberger for distinctions between truncation and censoring.

Censoring due to loss to follow-up during the study period is more troublesome than censoring due to the end of the study. The former runs a risk of being informative censoring, in which loss to follow-up is itself associated with some other survival-associated variable. In contrast, the date of the end of the study is simply an administrative choice that should bear no association with survival per se. Leung et al. describe different types of censoring in some detail.

If you include all of your patients from 2015 through 2019 in the study and perform a Kaplan-Meier analysis or a survival regression model with appropriate censoring indicators, the patients enrolled in 2019 will be able to contribute important survival information for that period of 1 year or so that elapsed between diagnosis and the end of the study.
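As a rough illustration of what "appropriate censoring indicators" could look like in practice, here is a minimal Python sketch that turns dates into a follow-up time and an event flag. The function name `follow_up`, its arguments, and the cutoff of 31 December 2020 are assumptions made for this example, not something specified in the question.

```python
from datetime import date

# Assumed administrative cutoff: end of 2020, as described in the question.
STUDY_END = date(2020, 12, 31)

def follow_up(diagnosis, death=None, last_contact=None, study_end=STUDY_END):
    """Return (time_in_years, event); event is 1 for an observed death, 0 if censored."""
    if death is not None and death <= study_end:
        # Death observed within the study window.
        end, event = death, 1
    else:
        # Censored: at last contact (loss to follow-up) if that came first,
        # otherwise administratively at the study end.
        end = min(last_contact, study_end) if last_contact is not None else study_end
        event = 0
    return (end - diagnosis).days / 365.25, event

# Hypothetical patients: a 2019 diagnosis still alive at study end,
# and a 2015 diagnosis with death observed in 2017.
late_patient = follow_up(date(2019, 7, 1))
early_patient = follow_up(date(2015, 3, 1), death=date(2017, 3, 1))
```

The 2019 patient ends up with roughly 1.5 years of follow-up and an event flag of 0, so that patient still contributes to the risk set for the first year and a half after diagnosis.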

Only those diagnosed from 2015 to 2017 would contribute to the data used beyond the 3-year survival time point, as you indicate in the question. But the survival estimates for those later times build on the estimates for earlier times, given the nature of Kaplan-Meier or regression models. So including those 2019 patients in the model provides better precision for the shorter-time survival estimates and thus also for the estimates of longer survival.
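To make the "later estimates build on earlier estimates" point concrete, here is a minimal pure-Python sketch of the Kaplan-Meier product-limit calculation. The function name `kaplan_meier` and the toy data are hypothetical and chosen only for illustration; for a real analysis an established survival package would be the safer choice.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Product-limit estimate: list of (t, S(t)) at each distinct event time."""
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    surv, curve = 1.0, []
    for t in sorted(deaths):
        # Everyone whose follow-up reaches t is still at risk at t,
        # including patients who are censored later (or exactly at t).
        at_risk = sum(1 for u in times if u >= t)
        # The estimate at t is the previous estimate times the
        # conditional probability of surviving past t.
        surv *= 1 - deaths[t] / at_risk
        curve.append((t, surv))
    return curve

# Toy data (years since diagnosis; event 1 = death, 0 = censored).
times = [0.5, 1.0, 1.0, 2.0, 3.5, 4.0]
events = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
```

Because each step multiplies the running estimate by a conditional survival probability, a patient censored at 1 year still enlarges the risk sets for the early event times, and every later value of the curve inherits those early factors.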

You should watch out for possible informative censoring, and there might be problems if there was a systematic change in post-diagnosis survival as a function of diagnosis year, for example if there was a major improvement in therapy during the study period. But absent such problems there is no reason to discard any of your patients from your study cohort based on the administrative censoring at the end of the study.
