Solved – Schoenfeld residuals and Categorical variables with multiple categories

r, schoenfeld-residuals, survival

I am having a bit of trouble with my survival data.
Basically, I am trying to assess whether the Cox proportional hazards assumption is met by plotting the Schoenfeld residuals in R.

I have several variables, and some of them have 3 or more categories.
As far as I know, Schoenfeld residuals are defined for each individual and each variable. So, when I tried to get the residuals for each of these variables with 3 or more categories, I was expecting to get one residual per individual, but instead I got more.

It's my first time using R with survival data, so probably I am doing something wrong. Here is the code I used with a variable that has 4 categories.

> library(survival)
> fit <- survfit(s ~ data$school)

# Cox model, fitted once and reused below
> cox_fit <- coxph(s ~ data$school)
> summary(cox_fit)

# Test of the proportional hazards assumption
> cox.zph(cox_fit)

# Schoenfeld residuals (defined for each variable for uncensored subjects)
> res <- residuals(cox_fit, type = "schoenfeld", collapse = FALSE)

When I call the first 3 rows (time = 7 days), I get a set of 3 residuals per individual:

> res[1:3,]

       data$school2     data$school3     data$school4
7        -0.6250424       -0.1333578        0.8462863
7        -0.6250424       -0.1333578        0.8462863
7         0.3749576       -0.1333578       -0.1537137

Can anyone please help me with this?

Best Answer

There is a separate Schoenfeld residual for each covariate for every uncensored individual.

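In the case of a categorical covariate with $k$ levels, $k-1$ dummy variables appear in the model, and each dummy variable gets its own Schoenfeld residual. A 4-level variable such as school therefore contributes 3 residuals per uncensored individual, which is exactly the output shown in the question.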
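
To see this on data anyone can load, here is a minimal sketch that assumes the `veteran` dataset shipped with the survival package, whose `celltype` factor has 4 levels. It stands in for the asker's `school` variable, which is not available here. The residual matrix should have one row per observed event and one column per dummy variable.

```r
library(survival)

# Assumption: the built-in 'veteran' data stands in for the asker's data;
# 'celltype' is a 4-level factor, like 'school' in the question.
fit <- coxph(Surv(time, status) ~ celltype, data = veteran)

# One row per uncensored subject (event), one column per dummy variable
res <- residuals(fit, type = "schoenfeld")

nlevels(veteran$celltype)  # number of categories, k
ncol(res)                  # number of dummy variables, k - 1
nrow(res)                  # number of events (uncensored subjects)
colnames(res)              # names of the non-reference levels
head(res, 3)               # row names are the event times
```

Each column can then be plotted against time on its own, for example with `plot(cox.zph(fit))`, to judge whether the effect of that level relative to the reference level looks constant over time.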
