Disclaimer: I'm no expert in the MIXED procedure; I simply happened to have similar questions when analysing my own data.
According to the SPSS 25 manual (p. 22), the repeated covariance type is the covariance structure for the residuals. Among others, SPSS provides the following structures:
- AR(1)
- Compound Symmetry
- Diagonal
- Unstructured
The only reason I picked these four (out of 22) is that I found them most relevant to my own research. On p. 80 of the same manual we can find brief explanations:
- AR(1). This is a first-order autoregressive structure with homogeneous
variances. The correlation between any two elements is equal to rho
for adjacent elements, rho^2 for elements that are separated by a
third, and so on. rho is constrained so that -1 < rho < 1.
- Compound Symmetry. This structure has constant variance and constant
covariance.
- Diagonal. This covariance structure has heterogeneous variances and
zero correlation between elements.
- Unstructured. This is a completely general covariance matrix.
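To make these descriptions concrete, here is a small sketch (in Python/NumPy rather than SPSS, with made-up parameter values) of what each of the four structures looks like for four repeated measurements:

```python
import numpy as np

n = 4        # number of repeated measurements
rho = 0.5    # illustrative correlation parameter
sigma2 = 2.0 # illustrative common variance

# AR(1): homogeneous variance, correlation rho**|i-j| between elements
idx = np.arange(n)
ar1 = sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

# Compound Symmetry: constant variance on the diagonal,
# constant covariance everywhere off the diagonal
cov = 0.8
cs = np.full((n, n), cov) + (sigma2 - cov) * np.eye(n)

# Diagonal: heterogeneous variances, zero correlation between elements
diag = np.diag([1.0, 1.5, 2.0, 2.5])

# Unstructured: every variance and covariance is a free parameter;
# any symmetric positive-definite matrix will do for illustration
a = np.array([[1.0, 0.2, 0.1, 0.0],
              [0.3, 1.0, 0.4, 0.2],
              [0.1, 0.2, 1.0, 0.3],
              [0.2, 0.1, 0.3, 1.0]])
unstructured = a @ a.T  # A A' is symmetric and positive definite here

print(np.round(ar1, 3))
```

Note how AR(1) and Compound Symmetry each need only two parameters, Diagonal needs one per time point, and Unstructured needs n(n+1)/2; that parameter count is exactly what the AIC/AICC comparisons mentioned below trade off against fit.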
I would not go so far as to recommend any structure for your data (this question is 6 years old, so you have probably sorted it out by now), but:
- For a repeated measures design, the default in SPSS is Diagonal.
- A. Field (in Discovering Statistics Using SPSS, p. 738) suggests testing different structures on the final model (estimated with ML, not the default REML) and comparing their goodness-of-fit indices (AIC, AICC).
- This answer might be helpful as well: https://stats.stackexchange.com/a/49786/133561
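Field's suggestion amounts to fitting the same fixed-effects model under each candidate residual covariance structure with ML and ranking the fits by AIC (= -2 logL + 2k) or the small-sample-corrected AICC. A minimal sketch of that comparison, with entirely made-up log-likelihoods and parameter counts:

```python
def aic(loglik, k):
    """Akaike information criterion: smaller is better."""
    return -2.0 * loglik + 2.0 * k

def aicc(loglik, k, n):
    """Small-sample corrected AIC: adds a penalty that vanishes as n grows."""
    return aic(loglik, k) + 2.0 * k * (k + 1) / (n - k - 1)

# Hypothetical ML log-likelihoods for the same fixed-effects model under
# different repeated covariance structures (all values made up):
fits = {
    "AR(1)":             (-512.3, 10),  # (loglik, number of parameters)
    "Compound Symmetry": (-515.1, 10),
    "Diagonal":          (-514.0, 12),
    "Unstructured":      (-509.8, 18),
}

n_obs = 200
for name, (ll, k) in sorted(fits.items(), key=lambda kv: aic(*kv[1])):
    print(f"{name:18s} AIC={aic(ll, k):8.1f}  AICC={aicc(ll, k, n_obs):8.1f}")
```

The richer Unstructured fit always has the highest log-likelihood, but its extra parameters can leave it with a worse AIC/AICC than a simpler structure; that is the point of the penalty term.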
If anyone can provide a plain English explanation I'd love to hear it and understand some more.
I'm not sure I can provide the kind of answer I'd like to, but I will try to throw out some pieces of information regarding your questions.
First, both @Seth and @gui11aume (+1 to each) have noted that lme() defaults to no within-group correlations. The question is why, and whether that is likely to be a problem. I believe the thinking is that a properly specified multilevel model will account for the covariance among your observations such that the residuals are independent. That is why the function was coded to expect no correlations. In other words, you may be OK.
Several of your questions concern the effect of a misspecified variance/covariance structure (bearing in mind that this may not actually apply to you). The estimates of your betas should be unaffected, that is, they should remain unbiased. However, the estimated variance of their sampling distributions will be inaccurate, and therefore so will your p-values. Moreover, I believe you cannot say a priori whether they will be too high or too low. If you are really concerned about these issues, you can always use robust (a.k.a. 'sandwich') standard errors. These are typically discussed in the context of generalized linear models, but they can be used elsewhere; check out the R package sandwich. Note that if they are not necessary, you may be at increased risk of type II errors.
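The R sandwich package implements these estimators for many model classes; to show where the name comes from, here is a minimal sketch of the simplest (HC0) sandwich estimator for OLS, on simulated heteroskedastic data (all names and values here are illustrative, not from the original question):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data whose error variance grows with |x| (heteroskedastic)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])                  # design with intercept
y = 1.0 + 2.0 * x + rng.normal(scale=1 + np.abs(x))   # true betas: (1, 2)

# OLS point estimates: unbiased even when the error covariance is misspecified
beta = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta

# "Bread" and "meat" of the HC0 sandwich:
#   Var(beta) = (X'X)^{-1} [ sum_i x_i x_i' e_i^2 ] (X'X)^{-1}
bread = np.linalg.inv(X.T @ X)
meat = X.T @ (X * resid[:, None] ** 2)
vcov_robust = bread @ meat @ bread

# Naive (homoskedastic) estimate for comparison: sigma^2 (X'X)^{-1}
vcov_naive = bread * (resid @ resid) / (n - X.shape[1])

print("naive SE: ", np.sqrt(np.diag(vcov_naive)))
print("robust SE:", np.sqrt(np.diag(vcov_robust)))
```

The point estimates are the same either way; only the standard errors (and hence the p-values) change, which is exactly the failure mode described above.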
The standard AR(1) variance/covariance structure does assume homoskedasticity, as far as I know. More restrictive, however, is that it assumes every observation was made at its scheduled time and that all measurements are equally spaced in time. These assumptions usually don't hold, even in the most fortunate situations, and as such, the AR(1) variance/covariance structure is dangerous to assume.
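The equal-spacing assumption is easy to see in the formula: discrete AR(1) indexes correlation by measurement position (rho^|i-j|), while its continuous-time analogue (often called CAR(1) or "spatial power") indexes it by actual measurement time (rho^|t_i - t_j|). A small sketch, with an illustrative rho:

```python
import numpy as np

rho = 0.6  # illustrative correlation parameter

def ar1_corr(n):
    """Discrete AR(1): correlation rho**|i-j| by measurement *position*."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def car1_corr(times):
    """Continuous-time analogue (CAR(1) / spatial power):
    correlation rho**|t_i - t_j| by actual measurement *time*."""
    t = np.asarray(times, dtype=float)
    return rho ** np.abs(t[:, None] - t[None, :])

# With equally spaced measurements the two coincide:
print(np.allclose(ar1_corr(4), car1_corr([0, 1, 2, 3])))  # True

# With unequal spacing (e.g. visits at weeks 0, 1, 4, 12) they diverge;
# discrete AR(1) would wrongly treat week 4 -> 12 like week 0 -> 1:
print(np.round(car1_corr([0, 1, 4, 12]), 4))
```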
Remember that proper specification of the model for the means is crucial. It is remotely possible that time is not relevant to the appropriate model of the mean, but that isn't very likely at all. Leaving TIME out of the model risks omitted-variable bias; thus, dropping TIME is likely to yield both biased estimates of the means and invalid inferences. This is just not worth gambling on.
Best Answer
Jeff Rouder has done a lot of work on similar issues, applying Bayesian hierarchical models to response-time data. Analyzing such data with an out-of-the-box multilevel model is difficult to fully justify: the data tend to be skewed, with a clear left cut-off (a minimum reaction time), so the assumption of normality that you have to make with standard multilevel software is dubious.
I would also imagine that standardization within subjects destroys important information, as the mean and variance within a subject may be linked, which could help identify the actual distributions you could use to model your RT data.