Solved – How to specify in r spatial covariance structure similar to SAS sp(pow) in a marginal model

generalized-least-squarespanel datarsasspatial

I'm currently translating existing code from SAS to R. I'm working on longitudinal data (CD4 count over time). I have the following SAS code :

Proc mixed data=df;
class NUM_PAT;
model CD4t=T /s ;
repeated / sub=NUM_PAT type=sp(pow)(T);

The SAS spatial power covariance structure is useful for unequally spaced longitudinal measurements where the correlations decline as a function of time (as shown by the picture below).

I think I have to use gls( ) from {nlme} since I don't have any random effects. As R 'only' provides "spherical", "exponential", "gaussian", "linear", and "rational" as correlation spatial structures, my guess is that I need to use corSpatial plus a weights argument.

I tried the following code, but it doesn't work :

gls(CD4t~T, data=df, na.action = (na.omit), method = "ML",
corr=corCompSymm(form=~1|NUM_PAT), weighhts=varConstPower(form=~1|T))

What am I doing wrong ?

Thanks for any help.

Best Answer

The spatial power covariance structure is a generalization of the first-order autoregressive covariance structure. Where the first-order autoregressive structure assumes the time points are equally spaced, the spatial power structure can account for a continuous time point. In reality, we could just forget the first-order autoregressive structure entirely, because if we fit the spatial power structure when the data are equally spaced we'll get the same answer as when using the first-order autoregressive structure.

All that aside, the correlation function you're looking for is corCAR1(), which is the continuous first-order autoregressive structure. If you're looking to duplicate what you fit in SAS, then the code you're looking for is:

gls(CD4t~T, data=df, na.action = (na.omit), method = "REML",
    corr=corCAR1(form=~T|NUM_PAT))

Of course, you don't need to specify method = "REML", since, as in SAS, the default method in gls() is already restricted maximum likelihood.

Related Solutions

Solved – How to account for spatial covariance in a linear model

1) You can model spatial correlation with the nlme library; there are several possible models you might choose. See pages 260-266 of Pinheiro/Bates.

A good first step is to make a variogram to see how the correlation depends on distance.

library(nlme)
m0 <- gls(response ~ level, data = layout)  
plot(Variogram(m0, form=~x+y))

Here the sample semivariogram increases with distance indicating that the observations are indeed spatially correlated.

One option for the correlation structure is a spherical structure; that could be modeled in the following way.

m1 <- update(m0, corr=corSpher(c(15, 0.25), form=~x+y, nugget=TRUE))

This model does seem to fit better than the model with no correlation structure, though it's entirely possible it too could be improved on with one of the other possible correlation structures.

> anova(m0, m1)
   Model df     AIC      BIC    logLik   Test  L.Ratio p-value
m0     1  3 46.5297 49.80283 -20.26485                        
m1     2  5 43.3244 48.77961 -16.66220 1 vs 2 7.205301  0.0273

2) You could also try including x and y directly in the model; this could be appropriate if the pattern of correlation depends on more than just distance. In your case (looking at sesqu's pictures) it seems that for this block anyway, you may have a diagonal pattern.

Here I'm updating the original model instead of m0 because I'm only changing the fixed effects, so the models should both be fit using maximum likelihood.

> model2 <- update(model, .~.+x*y)
> anova(model, model2)
Analysis of Variance Table

Model 1: response ~ level
Model 2: response ~ level + x + y + x:y
  Res.Df    RSS Df Sum of Sq      F   Pr(>F)   
1     22 5.3809                                
2     19 2.7268  3    2.6541 6.1646 0.004168 **

To compare all three models, you'd need to fit them all with gls and the maximum likelihood method instead of the default method of REML.

> m0b <- update(m0, method="ML")
> m1b <- update(m1, method="ML")
> m2b <- update(m0b, .~x*y)
> anova(m0b, m1b, m2b, test=FALSE)
    Model df      AIC      BIC     logLik
m0b     1  3 38.22422 41.75838 -16.112112
m1b     2  5 35.88922 41.77949 -12.944610
m2b     3  5 29.09821 34.98847  -9.549103

Remember that especially with your knowledge of the study, you might be able to come up with a model that is better than any of these. That is, model m2b shouldn't necessarily be considered to be the best yet.

Note: These calculations were performed after changing the x-value of plot 37 to 0.

Solved – using a number of variables (dumthe) in mixed model analysis in R

The answer to your first question, defining $Z$, is straightforward within the nlme() function itself:

random = pdMat(X1+X2+X3...X10 ~ 1)

Part two, Toeplitz $\zeta$, requires a custom solution. For the general remove-one symmetric structure, check out toeplitz(0:10). To understand how to build a custom corStruct, check out the function corAR1 in the source. All the implementations are in C; rolling your own will take some time.

correlation = corToep()  # this is the general idea

Best Answer

Related Solutions

Solved – How to account for spatial covariance in a linear model

Solved – using a number of variables (dumthe) in mixed model analysis in R

Related Question