Solved – AR-M correlation structure in GEE

generalized-estimating-equations, mixed model, r, regression

I am trying to use the gee package in R to fit a GEE model to some clinical trial data. The model fits fine with an independence or exchangeable working correlation structure. I am now trying to use an AR-1 structure, as follows:

eff.gee.ar <- gee(cluster_severity ~ logtime + cluster + cluster*logtime,
                  corstr = "AR-M", Mv = 1,
                  id = ID, data = ral2)

but this is causing the following error message:

Error in gee(cluster_severity ~ logtime + cluster + cluster * logtime,  : 
   VC_GEE_covlag: arg has > MAX_COVLAG rows

The data are sensitive, so I cannot share them, but here is the head() view to give you an idea of how they are arranged. They are sorted so that the rows for each cluster (ID) are always contiguous (as instructed in the gee package). The ral2 data frame has ~70k rows.

> head(ral2, n=20)
   ID WEEK  cluster cluster_severity  logtime
1   1    0 cluster1        2.0000000 0.000000
2   1    0 cluster2        0.7500000 0.000000
3   1    0 cluster3        1.5000000 0.000000
4   1    0 cluster4        2.3333333 0.000000
5   1    2 cluster1        1.4000000 1.098612
6   1    2 cluster2        0.0000000 1.098612
7   1    2 cluster3        1.0000000 1.098612
8   1    2 cluster4        2.3333333 1.098612
9   1    4 cluster1        0.2000000 1.609438
10  1    4 cluster2        0.0000000 1.609438
11  1    4 cluster3        0.7500000 1.609438
12  1    4 cluster4        3.0000000 1.609438
13  1    6 cluster1        0.4000000 1.945910
14  1    6 cluster2        0.0000000 1.945910
15  1    6 cluster3        0.5000000 1.945910
16  1    6 cluster4        2.0000000 1.945910
17  2    0 cluster1        1.8000000 0.000000
18  2    0 cluster2        0.2500000 0.000000
19  2    0 cluster3        0.7500000 0.000000
20  2    0 cluster4        0.6666667 0.000000
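
One way to sanity-check the layout described above is to confirm that each ID forms a single contiguous run of rows, and to see how many rows each ID contributes. The sketch below uses a small made-up data frame, `toy`, as a stand-in for `ral2` (which cannot be shared); only the column names are taken from the printout. Whether the number of rows per ID has anything to do with the `MAX_COVLAG` message is a guess on my part, not something I have confirmed.

```r
# Hypothetical stand-in for ral2: same column names, invented values
toy <- data.frame(
  ID      = rep(c(1, 2), times = c(16, 4)),
  WEEK    = c(rep(c(0, 2, 4, 6), each = 4), rep(0, 4)),
  cluster = rep(paste0("cluster", 1:4), times = 5)
)

# Each ID should appear as exactly one run of consecutive rows
runs <- rle(toy$ID)
contiguous <- anyDuplicated(runs$values) == 0
print(contiguous)

# Number of rows contributed by each ID, and the largest such count
rows_per_id <- table(toy$ID)
print(rows_per_id)
print(max(rows_per_id))
```

On real data, replacing `toy` with `ral2` would show whether any ID is split across non-adjacent rows and how large the biggest block of rows per ID is.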

Any advice or illumination regarding this error message would be greatly appreciated. Google only returns 6 results for the error message :(.

My only intuition so far is that the AR-1 structure cannot be fit to individuals who have only one data point (e.g. ID == 2 in the illustration above), although for some reason I expected that a GEE would cope with this missingness.
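
To test that intuition, one could count the distinct WEEK values per ID and list the individuals observed at a single time point. This is a minimal sketch on invented data (`toy` is hypothetical and only mimics the ID and WEEK columns shown above); it identifies such individuals but does not by itself show that they cause the error.

```r
# Hypothetical data: ID 1 is seen at four time points, ID 2 at one
toy <- data.frame(
  ID   = rep(c(1, 2), times = c(16, 4)),
  WEEK = c(rep(c(0, 2, 4, 6), each = 4), rep(0, 4))
)

# Number of distinct time points observed for each ID
n_weeks <- tapply(toy$WEEK, toy$ID, function(w) length(unique(w)))
print(n_weeks)

# IDs observed at only one time point
single_week_ids <- names(n_weeks)[n_weeks == 1]
print(single_week_ids)
```

Refitting with and without those IDs would be one way to see whether they are involved, though whether dropping them is defensible is a separate analysis decision.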

On a related note, is it true that GEEs are robust to mis-specification of the correlation structure anyway? This seems to be the word on the street, but I haven't found citations for or against this view.

Thanks in advance to any GEE-ers.

Best Answer

Apparently GEEs are a lonely topic on Cross Validated these days :(

I think the AR-M structure didn't work because there was high correlation between some of the random effects in the model. I fixed it by trying harder to model the data in a mixed-effects regression framework with an appropriate random effects setup.

I still haven't seen a definitive answer on mis-specification of the correlation structure in GEEs, but in my experience it doesn't seem to affect the fixed-effects estimates much (usually the same to 3 s.f.). Perhaps it is more of an issue in smaller datasets; this one is reasonably large.
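
For anyone who wants to check this on their own data, a rough approach is to fit the same mean model under several working correlation structures and put the coefficient estimates side by side. The sketch below does this on simulated, balanced data; the data frame `sim`, its columns, and the simulation settings are all invented for illustration, and the fits only run if the gee package is installed. It is a comparison on one simulated dataset, not evidence about robustness in general.

```r
set.seed(1)

# Simulated balanced data: 40 subjects, 5 time points each (all invented)
n_id <- 40
n_t  <- 5
sim <- data.frame(
  id   = rep(seq_len(n_id), each = n_t),
  time = rep(seq_len(n_t), times = n_id)
)
subj_effect <- rnorm(n_id)  # induces within-subject correlation
sim$y <- 1 + 0.5 * sim$time + subj_effect[sim$id] + rnorm(nrow(sim))

if (requireNamespace("gee", quietly = TRUE)) {
  # Same mean model, three working correlation structures
  fits <- list(
    independence = gee::gee(y ~ time, id = id, data = sim,
                            corstr = "independence"),
    exchangeable = gee::gee(y ~ time, id = id, data = sim,
                            corstr = "exchangeable"),
    ar1          = gee::gee(y ~ time, id = id, data = sim,
                            corstr = "AR-M", Mv = 1)
  )

  # One column of coefficient estimates per working correlation structure
  est <- sapply(fits, function(f) f$coefficients)
  print(round(est, 3))
}
```

The columns of `est` can then be compared directly; the robust standard errors from `summary()` on each fit are worth comparing in the same way.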