I have a dataset df consisting of 8000 observations with the following columns:
org_id property1 property2 property3 uptimeDay event
Here org_id is a categorical variable with 1199 different levels. The property variables are numerical properties of an organization.
coxp_1 <- coxph(formula = Surv(uptimeDay, event, type = 'right') ~ (property1 + property3)^2 + property2 + I(as.factor(org_id)), data = df_cox)
I am planning to run the Cox model above in R, but I keep getting the error message below. I am guessing it is caused by the fact that my categorical variable (org_id) has too many levels.
Error in fitter(X, Y, strats, offset, init, control, weights = weights, :
NA/NaN/Inf in foreign function call (arg 6)
Does anybody know what could be a potential solution for this problem?
Best Answer
The Cox proportional hazards model needs at least one event (event = 1) and one non-event (event = 0) for each level of the categorical variable. Otherwise you get what is called perfect classification (also known as complete separation): the coefficient for that level cannot be estimated, which can lead to this kind of NA/NaN/Inf error. To check this, look at the output of: xtabs(~event + org_id, data = df_cox)
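With 1199 levels the cross-tabulation is too wide to read by eye, so it may help to pull out the offending levels programmatically. The following is a minimal sketch on a small made-up data frame (the toy df_cox and the names no_events, no_non_events and problem_levels are assumptions for illustration, not part of the original data):

```r
# Toy stand-in for df_cox (hypothetical values, for illustration only)
df_cox <- data.frame(
  org_id = c("a", "a", "a", "b", "b", "c", "c", "c"),
  event  = c(1, 0, 1, 0, 0, 1, 1, 1)
)

# Rows are the event values ("0"/"1"), columns are the org_id levels
tab <- xtabs(~ event + org_id, data = df_cox)

# Levels that never have an event, and levels that never have a non-event
no_events     <- colnames(tab)[tab["1", ] == 0]
no_non_events <- colnames(tab)[tab["0", ] == 0]

# Every level that fails the "at least one of each" check
problem_levels <- union(no_events, no_non_events)
print(problem_levels)
```

On the real data, the same lines (without the toy data frame) should list the org_id levels worth inspecting first.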
My guess is that, since your dataset has 8000 observations spread over 1199 different levels (fewer than 7 observations per level on average), many levels fail this check. A solution would be to increase the number of observations or to club different levels together.
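One possible way to club levels together is to merge all sparsely populated organizations into a single catch-all level. This is only a sketch under assumptions: the threshold min_n, the new column org_grp and the label "other" are hypothetical choices, and grouping by something meaningful (for example organization type) may suit your data better.

```r
# Toy stand-in for df_cox (hypothetical values, for illustration only)
df_cox <- data.frame(
  org_id = c("a", "a", "a", "a", "b", "b", "c", "d", "d", "d")
)

# Number of observations per org_id level
counts <- table(df_cox$org_id)

# Hypothetical threshold: levels with fewer rows than this get merged
min_n <- 3
rare  <- names(counts)[counts < min_n]

# Build a coarser grouping variable, lumping rare levels into "other"
df_cox$org_grp <- as.character(df_cox$org_id)
df_cox$org_grp[df_cox$org_grp %in% rare] <- "other"
df_cox$org_grp <- factor(df_cox$org_grp)

print(table(df_cox$org_grp))

# The model could then be refit with org_grp in place of org_id, e.g.
# coxph(Surv(uptimeDay, event) ~ (property1 + property3)^2 + property2 + org_grp,
#       data = df_cox)
```

After regrouping, it is worth rerunning the xtabs check on org_grp to confirm that every remaining level has both events and non-events.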