This method is generally considered "old-fashioned", so while it may be possible, the syntax is difficult and I suspect fewer people know how to manipulate the anova commands to get what you want. The more common method is using glht from the multcomp package with a likelihood-based model from nlme or lme4. (I'd certainly welcome being proved wrong by other answers, though.)
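For what it's worth, that more common route might look roughly like this. This is a sketch only: the random-intercept model (ignoring treatment and phase) and the contrast rows are my assumptions, chosen to match the two contrasts from caracal's method below.

```r
# Hedged sketch, not the asker's actual model: random-intercept model for the
# hour effect, with the two contrasts tested via multcomp::glht.
library(lme4)
library(multcomp)

fit <- lmer(value ~ hour + (1 | id), data = dd)  # dd in long format, as below

# With default treatment contrasts the fixed effects are (Intercept),
# hour2, ..., hour5, so the cell means are mu1 = b0 and mu_k = b0 + b_k.
#   mean(1,2) - mu3         ->  0.5*b2 - b3
#   mean(1,2) - mean(4,5)   ->  0.5*b2 - 0.5*b4 - 0.5*b5
K <- rbind("h1&2 - h3"   = c(0, 0.5, -1,    0,    0),
           "h1&2 - h4&5" = c(0, 0.5,  0, -0.5, -0.5))
summary(glht(fit, linfct = K))
```

The p-values won't match the F-test route exactly (the error structure differs), which is part of why the likelihood-based approach is usually preferred when the design isn't balanced.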
That said, if I needed to do this, I wouldn't bother with the anova commands; I'd just fit the equivalent model using lm, pick out the right error term for this contrast, and compute the F test myself (or, equivalently, the t test, since there's only 1 df). This requires everything to be balanced and to satisfy sphericity, but if you don't have that, you should probably be using a likelihood-based model anyway. You might be able to partially correct for non-sphericity using the Greenhouse-Geisser or Huynh-Feldt corrections, which (I believe) use the same F statistic but adjust the df of the error term.
If you really want to use car, you might find the heplots vignettes helpful; they describe how the matrices in the car package are defined.
Using caracal's method (for the contrasts 1&2 - 3 and 1&2 - 4&5), I get
psiHat tStat F pVal
1 -3.0208333 -7.2204644 52.1351067 2.202677e-09
2 -0.2083333 -0.6098777 0.3719508 5.445988e-01
This is how I'd get those same p-values:
Reshape the data into long format and run lm to get all the SS terms.
library(reshape2)
d <- OBrienKaiser
d$id <- factor(1:nrow(d))
dd <- melt(d, id.vars=c(18,1:2), measure.vars=3:17)
dd$hour <- factor(as.numeric(gsub("[a-z.]*","",dd$variable)))
dd$phase <- factor(gsub("[0-9.]*","", dd$variable),
levels=c("pre","post","fup"))
m <- lm(value ~ treatment*hour*phase + treatment*hour*phase*id, data=dd)
anova(m)
Make an alternate contrast matrix for the hour term. The first two columns are the two contrasts of interest; the last two just fill out the remaining degrees of freedom for the 5-level factor.
foo <- matrix(0, nrow=nrow(dd), ncol=4)
foo[dd$hour %in% c(1,2) ,1] <- 0.5
foo[dd$hour %in% c(3) ,1] <- -1
foo[dd$hour %in% c(1,2) ,2] <- 0.5
foo[dd$hour %in% c(4,5) ,2] <- -0.5
foo[dd$hour %in% 1 ,3] <- 1
foo[dd$hour %in% 2 ,3] <- 0
foo[dd$hour %in% 4 ,4] <- 1
foo[dd$hour %in% 5 ,4] <- 0
Check that my contrasts give the same SS as the default contrasts (and the same as from the full model).
anova(lm(value ~ hour, data=dd))
anova(lm(value ~ foo, data=dd))
Get the SS and df for just the two contrasts I want.
anova(lm(value ~ foo[,1], data=dd))
anova(lm(value ~ foo[,2], data=dd))
Get the p-values.
> F <- 73.003/(72.81/52)
> pf(F, 1, 52, lower=FALSE)
[1] 2.201148e-09
> F <- .5208/(72.81/52)
> pf(F, 1, 52, lower=FALSE)
[1] 0.5445999
Optionally adjust for sphericity; the two calls below use the Greenhouse-Geisser and Huynh-Feldt epsilons, respectively.
pf(F, 1*.48867, 52*.48867, lower=FALSE)
pf(F, 1*.57413, 52*.57413, lower=FALSE)
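In case it's useful, those epsilons can be obtained from car itself by fitting the wide-format data as a multivariate lm. The idata/idesign arguments are car's actual API; the column layout below is my reading of the OBrienKaiser dataset (hour varying fastest within phase).

```r
# Sketch: get the Greenhouse-Geisser and Huynh-Feldt epsilons from car.
library(car)
Y <- as.matrix(OBrienKaiser[, 3:17])  # pre.1..5, post.1..5, fup.1..5
mlm <- lm(Y ~ treatment * gender, data = OBrienKaiser)
idata <- expand.grid(hour  = factor(1:5),
                     phase = factor(c("pre", "post", "fup"),
                                    levels = c("pre", "post", "fup")))
a <- Anova(mlm, idata = idata, idesign = ~ phase * hour)
summary(a, multivariate = FALSE)  # prints GG and HF epsilons for each term
```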
You could look into Multiple Factor Analysis. This can be implemented in R with FactoMineR.
UPDATE:
To elaborate, Leann was proposing – however long ago – to conduct a PCA on a dataset with repeated measures. If I understand the structure of her dataset correctly, for a given 'context' she had an animal x 'specific measure' (time to enter, number of times returning to shelter, etc.) matrix. Each of the 64 animals (those without missing obs.) was followed three times. Let's say she had 10 'specific measures', so she would then have three 64×10 matrices on the animals' behaviour (we can call the matrices X1, X2, X3). To run a PCA on the three matrices simultaneously, she would have to 'row bind' the three matrices (e.g. PCA(rbind(X1, X2, X3))). But this ignores the fact that the 1st and 65th rows are observations on the same animal. To circumvent this problem, she can 'column bind' the three matrices and run them through a Multiple Factor Analysis. MFA is a useful way of analyzing multiple sets of variables measured on the same individuals or objects at different points in time. She'll be able to extract the principal components from the MFA in the same way as in a PCA, but will have a single coordinate for each animal. The animal objects will now have been placed in a multivariate space of compromise delimited by her three observations.
She would be able to execute the analysis using the FactoMineR package in R.
Example code would look something like:
df=data.frame(X1, X2, X3)
mfa1=MFA(df, group=c(10, 10, 10), type=c("s", "s", "s"),
name.group=c("Observation 1", "Observation 2", "Observation 3"))
#presuming the data is quantitative and needs to be scaled to unit variance
Also, instead of extracting the first three components from the MFA and putting them through multiple regression, she might think about projecting her explanatory variables directly onto the MFA as 'supplementary tables' (see ?FactoMineR). Another approach would be to calculate a Euclidean distance matrix of the object coordinates from the MFA (e.g. dist1=vegdist(mfa1$ind$coord, "euc")) and put it through an RDA with dist1 as a function of the animal-specific variables (e.g. rda(dist1~age+sex+pedigree) using the vegan package).
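One caveat on that last step: vegan's distance-based variant, capscale (dbRDA), is the function that accepts a dist object on the left-hand side directly, so the sketch below uses it in place of rda. The data frame `animals` and its columns age, sex, and pedigree are placeholders standing in for her animal-level predictors.

```r
# Sketch of the distance-matrix + constrained-ordination step.
library(vegan)
dist1 <- vegdist(mfa1$ind$coord, method = "euclidean")
ord <- capscale(dist1 ~ age + sex + pedigree, data = animals)  # dbRDA
anova(ord, by = "terms", permutations = 999)  # permutation test per predictor
```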
Best Answer
The answer here depends on your situation. Dunlap, Cortina, Vaslow, and Burke (1996) argued that the effect size should be calculated using an SD based on the pooled variance of the separate conditions, as is typical in independent-groups studies, even with repeated measurements. Their argument was that the study may be replicated with a between-subjects design, and effect sizes will be more comparable across studies in a meta-analysis with that measure. They asserted that the effect size is the effect size and shouldn't be influenced by the correlation among measurements in a repeated-measures design.
Unfortunately, this suggestion has been overgeneralized in some literatures (and in Cortina's book, I believe). When it's not possible to design an experiment any other way than repeated measures, then using the between-subjects effect size is a mistake. It will underestimate the size of the effect and be useless in power calculations.
Imagine an attentional cueing study where you need to study a single mental state (e.g. oriented in a direction indicated by an arrow) and have to measure the effect by comparing performance at the indicated location and at one that is not indicated. That study has to be done within subjects; there is no other way to do it. In that case, the need to have an effect size comparable to situations where the study is done with independent groups vanishes, because the independent-groups study couldn't occur. The between-subjects effect size would not give a useful estimate of the number of subjects needed to replicate the study, while the within-subjects one would; it would tend to vastly underestimate what you actually need to measure, which is the effect within subjects.
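To make the difference concrete: with equal variances, the within-subjects d_z and the between-subjects d are related by d_z = d / sqrt(2(1 − r)), so a high within-subject correlation r makes d_z considerably larger. A small simulated illustration (the numbers are made up, not from any cited study):

```r
# Simulated paired data with strong within-subject correlation.
set.seed(1)
n <- 30
subj <- rnorm(n, 0, 1)                 # between-subject variability
x1 <- subj + rnorm(n, 0, 0.5)          # condition 1
x2 <- subj + 0.5 + rnorm(n, 0, 0.5)    # condition 2, true shift of 0.5

sd_pooled <- sqrt((var(x1) + var(x2)) / 2)
d_between <- (mean(x2) - mean(x1)) / sd_pooled    # Dunlap et al.'s measure
d_within  <- (mean(x2) - mean(x1)) / sd(x2 - x1)  # d_z, matches the paired t
c(d_between = d_between, d_within = d_within)
# With r > 0.5, d_within exceeds d_between, so plugging d_between into a
# power calculation for a within design overestimates the n needed.
```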