Solved – How does clogit() (in R) handle incomplete strata

clogit, missing data, paired-data

I am running a conditional logistic regression in R using clogit(). I have 314 strata with 1 case and 1 control in each stratum (628 observations in total). Several predictor variables have missing values, so 6 observations are excluded from the analysis. That leaves 622 observations with 310 events. Two strata now contain 0 cases and 1 control. I expected such strata to be omitted from the analysis, but that is not the case: 622 residuals are reported. How does clogit() handle strata in which one member of the pair has a missing value?

Best Answer

The residual for the remaining observation in that stratum will be zero. There's no need to remove it: if the stratum held only two observations to begin with, the one that is left provides no information about the coefficients, so keeping it does not change the fit.
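To see why the leftover observation is uninformative, it may help to write out the conditional likelihood contribution of a single 1:1 matched stratum. This is the standard textbook form, not something taken from the clogit() source; the symbols $x_{i1}$ (case covariates), $x_{i0}$ (control covariates) and $\beta$ are notation introduced here for illustration.

```latex
% Contribution of a complete 1:1 matched stratum i
L_i(\beta) = \frac{\exp(x_{i1}^{\top}\beta)}
                  {\exp(x_{i1}^{\top}\beta) + \exp(x_{i0}^{\top}\beta)}

% Only the case remains: the risk set is the case itself
L_i(\beta) = \frac{\exp(x_{i1}^{\top}\beta)}{\exp(x_{i1}^{\top}\beta)} = 1,
\qquad
\frac{\partial \log L_i(\beta)}{\partial \beta} = 0

% Only the control remains: the stratum has no event, so it adds
% no factor to the product over events (an empty product, equal to 1)
```

In both incomplete cases the stratum's contribution is constant in $\beta$, so it cannot move the estimate.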

> library(survival)
> data(retinopathy)
> head(retinopathy)
  id laser   eye age     type trt futime status risk
1  5 argon  left  28    adult   1  46.23      0    9
2  5 argon  left  28    adult   0  46.23      0    9
3 14 argon right  12 juvenile   1  42.50      0    8
4 14 argon right  12 juvenile   0  31.30      1    6
5 16 xenon right   9 juvenile   1  42.27      0   11
6 16 xenon right   9 juvenile   0  42.27      0   11
> 
> allmodel<- clogit(status~trt+strata(id),data=retinopathy)
> allmodel
Call:
clogit(status ~ trt + strata(id), data = retinopathy)

      coef exp(coef) se(coef)      z       p
trt -1.371     0.254    0.280 -4.896 9.8e-07

Likelihood ratio test=29.9  on 1 df, p=4.544e-08
n= 394, number of events= 155 
> resid(allmodel)[1:4]
         1          2          3          4 
 0.0000000  0.0000000 -0.2025316  0.2025316 
> 
> retinopathy$trt[3]<-NA
> missmodel<- clogit(status~trt+strata(id),data=retinopathy)
> missmodel
Call:
clogit(status ~ trt + strata(id), data = retinopathy)

       coef exp(coef) se(coef)      z        p
trt -1.3545    0.2581   0.2804 -4.831 1.36e-06

Likelihood ratio test=28.97  on 1 df, p=7.344e-08
n= 393, number of events= 155 
   (1 observation deleted due to missingness)
> resid(missmodel)[1:3]
1 2 4 
0 0 0

[If there had been more than two observations in the stratum to begin with, the stratum would of course still be informative. The residuals would then be what you'd expect for the remaining data in the stratum.]
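One way to convince yourself of this is to refit the model with the incomplete stratum dropped and compare coefficients. The following is a minimal sketch, assuming the retinopathy data shipped with the survival package and R's default na.action (na.omit). The names dat, used, singletons, fit_all and fit_pairs are made up for this illustration.

```r
library(survival)

# Work on a copy of the example data and repeat the answer's edit:
# row 3 (stratum id 14) loses its covariate value.
dat <- survival::retinopathy
dat$trt[3] <- NA

# Fit on everything; clogit() drops the NA row itself under the
# default na.action, leaving one observation in stratum 14.
fit_all <- clogit(status ~ trt + strata(id), data = dat)

# Find strata left with fewer than two complete rows.
used <- complete.cases(dat[, c("status", "trt", "id")])
n_per_id <- table(dat$id[used])
singletons <- names(n_per_id)[n_per_id < 2]

# Refit after removing the incomplete strata by hand.
dat_pairs <- dat[used & !(dat$id %in% singletons), ]
fit_pairs <- clogit(status ~ trt + strata(id), data = dat_pairs)

# If the singleton stratum carries no information, the two
# coefficient estimates should agree up to numerical tolerance.
all.equal(unname(coef(fit_all)), unname(coef(fit_pairs)), tolerance = 1e-6)
```

The reported n and number of events will differ between the two fits, because the leftover observation is still counted in the first one. Only the estimates are expected to match.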