I am having a bit of trouble with my survival data.
Basically, I am trying to assess whether the Cox proportional hazards assumption is met by plotting the Schoenfeld residuals in R.
I have several variables and some of them have 3 or more categories.
As far as I know, Schoenfeld residuals are computed for each individual and each variable. So when I tried to get the residuals for each of these variables with 3 or more categories, I expected one residual per individual, but instead I got more.
It's my first time using R with survival data, so probably I am doing something wrong. Here is the code I used with a variable that has 4 categories.
> library(survival)
> fit <- survfit(s ~ data$school)
# Cox model (fitted once and reused below)
> cox <- coxph(s ~ data$school)
> summary(cox)
# Test for proportional hazards assumption
> cox.zph(cox)
# Schoenfeld residuals (defined for each variable for uncensored subjects)
> res <- residuals(cox, type = "schoenfeld", collapse = FALSE)
When I call the first 3 rows (time = 7 days), I get a set of 3 residuals per individual:
> res[1:3,]
  data$school2 data$school3 data$school4
7   -0.6250424   -0.1333578    0.8462863
7   -0.6250424   -0.1333578    0.8462863
7    0.3749576   -0.1333578   -0.1537137
Can anyone please help me with this?
Best Answer
There is a separate Schoenfeld residual for each covariate for every uncensored individual.
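To make this concrete, here is the usual definition, written in notation that is my own choice rather than anything taken from the question, and ignoring the details of how tied event times are handled. For individual $i$ with an event at time $t_i$, the residual for covariate $j$ is the observed covariate value minus its risk-weighted average over the risk set $R(t_i)$:

$$
r_{ij} = x_{ij} - \bar{x}_j(t_i), \qquad
\bar{x}_j(t_i) = \frac{\sum_{l \in R(t_i)} x_{lj} \exp(x_l^\top \hat\beta)}{\sum_{l \in R(t_i)} \exp(x_l^\top \hat\beta)}
$$

For a 0/1 indicator column such as data$school2 in the output above, $x_{ij}$ is either 0 or 1, so at a given event time the residual can take only two values, and they differ by exactly 1. That matches the printed rows: $0.3749576 - (-0.6250424) = 1$ for data$school2 and $0.8462863 - (-0.1537137) = 1$ for data$school4.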
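In the case of a categorical covariate with $k$ levels, $k-1$ dummy variables appear in the model, so each uncensored individual gets $k-1$ residuals, one per dummy variable. Your four-category school variable enters the model as three dummies (school2, school3 and school4, with the first level as the reference), which is why each row of res has three entries.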
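A minimal sketch of the same behaviour on data anyone can run, assuming the veteran dataset that ships with the survival package (it stands in for the asker's data, which is not available here). Its celltype factor has four levels, so the residual matrix should have three columns and one row per event:

```r
# Illustration only: veteran and celltype are stand-ins for the asker's data.
library(survival)

# celltype is a factor with 4 levels, so the model contains 3 dummy variables
fit <- coxph(Surv(time, status) ~ celltype, data = veteran)

# Schoenfeld residuals: one row per event, one column per model coefficient
res <- residuals(fit, type = "schoenfeld")

dim(res)        # number of events x (number of levels - 1)
colnames(res)   # one column per non-reference level
head(res, 3)

# The dimensions match the description above
stopifnot(ncol(res) == nlevels(veteran$celltype) - 1)
stopifnot(nrow(res) == fit$nevent)

# Proportional hazards test; plot(zph) would draw the scaled
# Schoenfeld residuals against time with a smooth curve
zph <- cox.zph(fit)
print(zph)
```

So getting several residuals per individual is expected, not a sign of a coding mistake. In recent versions of survival, cox.zph reports one test for the factor as a whole by default rather than one per dummy variable.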