Solved – Power analysis of repeated measures ANOVA using simulation in R

anovarrepeated measuressimulationstatistical-power

I would appreciate another opinion on this simulation approach to estimating power for a within-subjects (or repeated-measures) ANOVA; in this case a 2×2 factorial design. In other words, is there anything WRONG with this method that I've missed? The code below is R.


library(ez)

nsub = 30
nconds = 4

nsims = 1000

#create an empty matrix that will be populated with p-values 
p = matrix(NA, nrow=nsims, ncol=3)

#subject vector
sub = sort(rep(1:nsub, nconds))

#2x2 factorial design
cond = data.frame(x1=c('a','a','b','b'), x2=c('c','d','c','d'))

# fixed effects 
intercept = rep(-1, nconds)
true.effect.x1 = c(0, 0, .5, .5)
true.effect.x2 = c(0, .5, .5, .5)
X = rep((intercept + true.effect.x1 + true.effect.x2),nsub)

#simulation loop
for (x in 1:nsims){

  #random effects

  #relatively large subject-specific variance in intercepts
  sub.intercept = rep(rnorm(nsub, mean=0, sd=2), times=1, each=nconds) 

  #relatively small by-subject adjustments to the effects
  sub.effect = rep(rnorm(nsub, mean=0, sd=0.05), times=1, each=nconds) 

  #unexplained error
  error = rnorm(nsub*nconds, mean=0, sd=1)

  #simulated dependent variable
  observed.y = X + (sub.intercept + sub.effect + error)

  #place everything in a data frame
  df = cbind(sub, cond, observed.y)
  names(df) = c('sub','x1','x2','y')
  df$sub = as.factor(df$sub)

  #extract the p-values for each effect from a repeated measure ANOVA (ezANOVA from 'ez' package)
  p[x,1] = ezANOVA(data=df, dv=.(y), wid=.(sub), within=.(x1, x2))$ANOVA[1,5]
  p[x,2] = ezANOVA(data=df, dv=.(y), wid=.(sub), within=.(x1, x2))$ANOVA[2,5]
  p[x,3] = ezANOVA(data=df, dv=.(y), wid=.(sub), within=.(x1, x2))$ANOVA[3,5]
}

###### p-values < .05 ? ######
sig.x1 = ifelse(p[,1] <= .05, 1, 0)
sig.x2 = ifelse(p[,2] <= .05, 1, 0)
sig.int = ifelse(p[,3] <= .05, 1, 0)

###### Histograms ######
par(mfrow=c(3,1))
hist(p[,1], 20, xaxp=c(0, 1, 20), col=2, main = paste('power X1:', mean(sig.x1 * 100), '%  with ', nsub, 'subjects'))
hist(p[,2], 20, xaxp=c(0, 1, 20), col=2, main = paste('power X2:', mean(sig.x2 * 100), '%  with ', nsub, 'subjects'))
hist(p[,3], 20, xaxp=c(0, 1, 20), col=2, main = paste('power interaction:', mean(sig.int * 100), '%  with ', nsub, 'subjects'))

Best Answer

While there are a few places you could optimize your code a little bit, you seem to have everything covered. One this to keep in mind is that even though you have a repeated-measures design, there is only one observation per unique cell condition within each individual. If you had more, you could do a boostrapped power analysis rather than re-simulating a new dataset every time. In that case, you would sample with replacement.