How are the cumulative incidence functions calculated for different treatment groups?

survival

I'm wondering how the cumulative incidence function for a single event type is computed when you specify that you want to get separate curves for, for example, treatment vs placebo. One curve ends up above the original curve (generated independent of treatment group) and one below, whereas I thought they would both be below and add up to the original curve. What is actually happening? I guess it is the same when you do this with a Kaplan-Meier curve as well?

Best Answer

What you expect is what you would get if you were showing the total number of events in the cohort over time. Then, yes, the events in two treatment groups should add up to the total number of events.

That type of display isn't usually helpful, for a couple of reasons. First, it doesn't take censoring into account properly. Second, it makes it hard to compare against other data, as it depends on the overall sample size.
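To make the contrast concrete, here is a minimal Python sketch of the raw-count display described above. The event times and group names are made up for illustration, and censoring is ignored entirely, which is exactly the weakness of this kind of plot.

```python
from bisect import bisect_right

def cumulative_events(event_times, t):
    """Number of observed events at or before time t (censoring is ignored)."""
    return bisect_right(sorted(event_times), t)

# Hypothetical event times, purely for illustration
treated_events = [4, 9, 12]
placebo_events = [2, 3, 5, 7, 11]
pooled_events = treated_events + placebo_events

count_treated = cumulative_events(treated_events, 8)
count_placebo = cumulative_events(placebo_events, 8)
count_pooled = cumulative_events(pooled_events, 8)

# Raw counts are additive: the group counts sum to the pooled count
print(count_treated, count_placebo, count_pooled)

# Raw counts also scale with cohort size: duplicating the cohort doubles the count
count_doubled = cumulative_events(pooled_events * 2, 8)
print(count_doubled)
```

The counts behave the way the question expects (both group curves below the total, summing to it), but they are not probabilities and can't be compared across studies of different sizes.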

If there's only one event type, and it can happen at most once to an individual, what's of primary interest is typically the probability of survival over time. That's what you get with a Kaplan-Meier plot. With no competing events, the cumulative incidence is just 1 minus that survival probability. Probabilities are bounded by 0 and 1, and they don't add across groups: the combined estimate behaves roughly like a weighted average of the group estimates, not like their sum. So if 2 treatment groups differ in survival probability over time, their individual survival curves will be on opposite sides of the combined survival curve. The group with higher-than-average survival at any time will be above the combined curve, the other one below.
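A small hand-rolled Kaplan-Meier calculation illustrates this. It is only a sketch on made-up data, written in plain Python rather than with the R survival package, and it assumes a single event type with no competing risks, so that cumulative incidence can be taken as 1 minus the Kaplan-Meier estimate.

```python
def kaplan_meier(times, events, t):
    """Kaplan-Meier survival estimate at time t.

    times:  follow-up time for each individual
    events: 1 if the event was observed at that time, 0 if censored
    """
    surv = 1.0
    # Distinct observed event times up to and including t
    event_times = sorted({u for u, e in zip(times, events) if e == 1 and u <= t})
    for u in event_times:
        at_risk = sum(1 for x in times if x >= u)
        n_events = sum(1 for x, e in zip(times, events) if x == u and e == 1)
        surv *= 1.0 - n_events / at_risk
    return surv

# Hypothetical data, purely for illustration
treated_times, treated_events = [5, 8, 12, 12, 15], [1, 0, 1, 0, 0]
placebo_times, placebo_events = [2, 4, 6, 9, 10], [1, 1, 1, 0, 1]
pooled_times = treated_times + placebo_times
pooled_events = treated_events + placebo_events

t = 7
surv_treated = kaplan_meier(treated_times, treated_events, t)
surv_placebo = kaplan_meier(placebo_times, placebo_events, t)
surv_pooled = kaplan_meier(pooled_times, pooled_events, t)

# Cumulative incidence = 1 - survival (valid here because there are no competing risks)
cif_treated, cif_placebo, cif_pooled = 1 - surv_treated, 1 - surv_placebo, 1 - surv_pooled
print(cif_treated, cif_pooled, cif_placebo)
```

In this toy data the pooled cumulative incidence at time 7 lies between the two group values rather than equalling their sum, which matches the pattern described in the question.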

With multiple events of the same type possible, which seems to be what you have in mind, the usual display is the estimated mean number of events per individual over time. See the plot at the end of Section 3.2 of the main R survival vignette. The same principle applies: with 2 treatment groups, the group with a higher estimated number of events per individual will be plotted higher than the overall estimate for the cohort, and the other group below.
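For the recurrent-event case, the following sketch uses a Nelson-Aalen-type estimate of the mean cumulative number of events per individual: at each event time, add the number of events divided by the number of individuals still under observation. The data and the `(event_times, follow_up_end)` layout are assumptions for illustration, and this is not claimed to reproduce the exact computation behind the vignette's plot.

```python
def mean_cumulative_events(subjects, t):
    """Estimated mean number of events per individual by time t.

    subjects: list of (event_times, follow_up_end) pairs, one per individual.
    """
    event_times = sorted({u for ev, _ in subjects for u in ev if u <= t})
    total = 0.0
    for u in event_times:
        # Individuals still under observation at time u
        at_risk = sum(1 for _, end in subjects if end >= u)
        n_events = sum(ev.count(u) for ev, _ in subjects)
        total += n_events / at_risk
    return total

# Hypothetical data, purely for illustration
treated = [([3], 10), ([], 10), ([6], 8), ([], 10)]
placebo = [([2, 5], 10), ([4], 10), ([1, 7, 9], 10), ([3], 6)]
pooled = treated + placebo

t = 8
mcf_treated = mean_cumulative_events(treated, t)
mcf_placebo = mean_cumulative_events(placebo, t)
mcf_pooled = mean_cumulative_events(pooled, t)
print(mcf_treated, mcf_pooled, mcf_placebo)
```

Because the estimate is per individual, the pooled value falls between the two group values here, just as with the survival probabilities.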
