Power Analysis – Conducting A Priori Power Analysis for Generalized Linear Mixed-Effects Models

generalized linear modelmixed modelsimulationstatistical-power

Let's say I have a data set (with a binary DV) which I used a generalized linear mixed-effects model on. I am currently trying to run a simulation-based power analysis to determine the sample size I should try to collect for a replication attempt.

I know about the simulate function for fitted lme4 model objects, which will create simulated responses that follow the size and structure of the actual dataset.

My question is, how can I use this fitted model object to run an a-priori power analysis for my replication?. In other words, is there an easy way to extract the necessary information from the simulate function, vary the sample size, then to simulate responses?

EDIT:

Here is a simple example with ten subjects that should give me a good starting place. The data here are totally fake so the effects are not close to what I have in the pilot data.

The two variables of interest are age in months (between subjects) and whether the stimulus is "similar" or "dissimilar" (within subjects). Each individual is exposed to three items within each similarity group, for a total of six binary responses for each subject. As seen in the model, I am interested in a possible interaction between age and similarity and include a random slope of similarity to do this.

Let me know if I need to modify anything/provide more information to make things easier.

set.seed(28)

fakeData <- data.frame(ID = rep(1:10, each = 6),
                       months = rep(round(runif(10, min = 60, max = 80)), each = 18),
                       similarity = rep(c("Sim", "Dissim"), each = 3),
                       response = sample(c(0, 1), 60, replace = TRUE))

fakeModel <- glmer(response ~ months * similarity + (similarity | ID), family = binomial, data = fakeData)
summary(fakeModel)

Best Answer

Generate data:

set.seed(28)
fakeData <- data.frame(ID = rep(1:10, each = 6),
          months = rep(round(runif(10, min = 60, max = 80)), each = 18),
          similarity = rep(c("Sim", "Dissim"), each = 3),
          response = sample(c(0, 1), 60, replace = TRUE))

Fit initial model:

library(lme4)
fakeModel <- glmer(response ~ months * similarity + (similarity | ID), 
                   family = binomial, data = fakeData)
summary(fakeModel)

Now I'm going to write a generic function that replicates your data-generation code above. (I'm a little bit nervous about the way you generated your data by implicit replication of the vectors -- I would prefer to explicitly write out the expected amount of replication for each column. I'm not sure that the code below will work properly if you change the numbers ...)

datfun <- function(nindiv=10,nperindiv=6,
                   nsim=3, monthmin=60, monthmax=80) {
      data.frame(ID=factor(rep(1:nindiv, each=nperindiv)),
      months = rep(round(runif(nindiv, min=monthmin, max=monthmax)),
                    each=nsim*nperindiv),
      similarity = rep(c("Sim","Dissim"), each=nsim),
      response = sample( 0:1, nindiv*nperindiv, replace=TRUE))
}

Once you've done this, however, it's easy (using the development version of lme4) to simulate new responses with the same parameters as the initial model (or to change the parameters as you choose):

simdat <- simulate(formula(fakeModel), newdat=datfun(),
                   newparam=list(theta=getME(fakeModel,"theta"),
                                 beta=getME(fakeModel,"beta")),
                   family=binomial)

The result is a data frame with 1 column. If you want to do this multiple times per data set size you may want to set nsim larger ...

Related Question