Solved – Accounting for grouped random effects in lme4

Tags: lme4-nlme, mixed model, r, random-effects-model, repeated measures

I am editing my question as it was not detailed enough. I made an (unsuccessful) shortcut. Sorry, here is the entire story.

In my experiment I test subjects' reactions to some (simulated) situations. Each subject reads a scenario, and then an expert evaluates the subject's behavior. The evaluation ranges from 1 to 5. There are 10 different simulations, and each subject takes all of them; thus, I have 10 data points from each subject. My experiment ran for 30 days. Within a given day, the same 10 simulations are used for all subjects. In other words, within each day, subjects and simulations are fully crossed. The simulations differ from day to day.

There are 3 categories of simulations (A, B and C). From a theoretical perspective, the categories are distinct from one another. Category A is tested by 3 simulations (a1, a2, a3); B by 3 (b1, b2, b3); C by 4 (c1, c2, c3, c4). The simulations a1 to c4 on day 1 are different from a1 to c4 on day 2, and so on. Importantly, participants took the experiment only once and are not allowed to participate again. That is, if a participant takes part on, let's say, day 1, he takes the 10 simulations that were used on day 1 and can never participate again.
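To make the layout concrete, here is a minimal sketch of the design in base R. The column names (`subject`, `day`, `simulation`, `category`, `sim_id`) and the number of subjects per day are assumptions made for illustration only; they are not taken from my real data.

```r
# Hypothetical layout: 30 days, 10 simulation labels per day, and
# 5 subjects per day (a small number, just to keep the example light).
n_days <- 30
subjects_per_day <- 5
sim_labels <- c("a1", "a2", "a3", "b1", "b2", "b3", "c1", "c2", "c3", "c4")

# One row per subject: each subject belongs to exactly one day.
subjects <- data.frame(
  subject = factor(seq_len(n_days * subjects_per_day)),
  day     = factor(rep(seq_len(n_days), each = subjects_per_day))
)

# merge() with no shared columns returns the Cartesian product, so every
# subject is crossed with the 10 simulation labels used on his or her day.
design <- merge(subjects, data.frame(simulation = sim_labels))

# Category is the first letter of the label (A, B or C).
design$category <- factor(toupper(substr(design$simulation, 1, 1)))

# The labels repeat across days but the simulations themselves do not,
# so a unique simulation id needs both the label and the day.
design$sim_id <- interaction(design$simulation, design$day, drop = TRUE)

nlevels(design$sim_id)        # 300: 10 labels x 30 days
table(table(design$subject))  # number of rows contributed by each subject
```

The `sim_id` column plays the same role as the `simulation:day` grouping term in the models below: the label a1 on day 1 and the label a1 on day 2 are treated as different simulations.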

The 3 categories are the only ones I am interested in. In that sense, I think they should be treated as fixed effects. Each category is tested/represented by some simulations. Yet for each category there is an infinite number of possible simulations, and I sampled only some of them. In that sense, simulations are random.

The only question I am interested in here is the effect of the subject's gender on the grades. I want to control for all other parameters. My question is how to account for simulation and category. I would like to generalize my results beyond the participants and the simulations representing each category. Yet it is also possible that gender interacts with category or simulation.

So here is one "basic" model:

lmer(grade ~ gender + (1|subject) + (1|simulation:day), data = My_data)

Yet this model does not account for the possibility that the effect of gender differs across simulations. So here is another one that tries to do that.

lmer(grade ~ gender + (1|subject) + (1 + gender|simulation:day), data = My_data)

And here I get stuck. How does category play a role? Do I need to enter it as fixed effect? If yes, what about simulations? Does the following make sense?

lmer(grade ~ gender*category + (1 + category|subject) + (1 + gender|simulation:day), data = My_data)

Or is it better to give up the simulations and treat category as random? But in this case, for a given day, the same category will appear several times for each subject (e.g., A will appear 3 times). Isn't that a problem? As follows:

lmer(grade ~ gender + (1|subject) + (1 + gender|category), data = My_data)

A final point: I have a lot of data (several thousand participants), so convergence should not be a problem.

Thanks a lot for the help

Best Answer

A couple of points:

  • If I understood correctly, your response variable is a grade, ranging from 1 to 5. This is an ordinal variable with relatively few levels. Hence, assuming a normal distribution for the residuals may not be appropriate. You could consider a mixed model for ordinal data instead, such as the continuation ratio model.
  • You are right that subject and simulation seem to be fully crossed factors. Hence, a possible model to consider is:

    grade ~ category * gender + (1 | subject) + (1 | simulation)
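As a rough illustration, the formula above could be fitted with `lmer()` as in the sketch below. Everything about the data here is an assumption: the sample sizes, effect sizes, and column names are simulated stand-ins, not the asker's data. One point that follows from the question: lme4 treats identical labels as the same level, and the question states that the labels a1 to c4 are reused every day for different simulations, so the sketch builds a unique identifier `sim_id` (a hypothetical name) from label and day and uses it as the simulation grouping factor.

```r
library(lme4)

set.seed(1)

# Simulated stand-in data (hypothetical sizes and effects).
n_days <- 10
subjects_per_day <- 20
n_subj <- n_days * subjects_per_day
sim_labels <- c("a1", "a2", "a3", "b1", "b2", "b3", "c1", "c2", "c3", "c4")

subjects <- data.frame(
  subject = factor(seq_len(n_subj)),
  day     = factor(rep(seq_len(n_days), each = subjects_per_day)),
  gender  = factor(sample(c("F", "M"), n_subj, replace = TRUE))
)

# Cartesian product: each subject takes all 10 simulations of his or her day.
d <- merge(subjects, data.frame(simulation = sim_labels))
d$category <- factor(toupper(substr(d$simulation, 1, 1)))

# Unique simulation id, because labels repeat across days.
d$sim_id <- interaction(d$simulation, d$day, drop = TRUE)

# Generate grades from random intercepts for subject and simulation,
# then round and clamp to the 1-5 scale.
u_subj <- rnorm(nlevels(d$subject), sd = 0.5)
u_sim  <- rnorm(nlevels(d$sim_id), sd = 0.3)
latent <- 3 + 0.2 * (d$gender == "M") +
  u_subj[as.integer(d$subject)] + u_sim[as.integer(d$sim_id)] +
  rnorm(nrow(d), sd = 0.7)
d$grade <- pmin(5, pmax(1, round(latent)))

# Crossed random intercepts for subject and (unique) simulation.
fit <- lmer(grade ~ category * gender + (1 | subject) + (1 | sim_id), data = d)
summary(fit)
```

If the simulation column in the real data already identifies each simulation uniquely across days, `(1 | simulation)` can be used as written. To respect the ordinal scale, the same random-effects structure could probably be passed to an ordinal mixed model, for example `clmm()` from the ordinal package with the grade as an ordered factor; note that this fits a cumulative link model, which is a different ordinal model from the continuation ratio model mentioned above.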