Solved – How to “nest” data in a mixed-effects model? and MANY related questions about mixed effects models!

mixed model, random-effects-model, repeated measures, spss

I treated 80 people with drug X and 80 with drug Y. I presented the drugs to them in groups of 10 (meow groups). The drug was consumed for 15 weeks. They reported the severity of their headaches (from 1-100) once a week. All individuals are only in one meow group and everyone within a meow group received the same drug. The meow group that a person is in (not only the drug, but the actual meow group of 10) will influence treatment outcomes because the groups meet weekly.

In essence, I have:

2 drugs (treatment)
16 meow groups
160 cases
15 time points

I want to know whether people who received drug X reported less headache intensity over the 15 weeks. Basically, I want to know the headache severity slope for drug X, and whether that differs from drug Y. Please note that I am not particularly interested in the slopes in general, how quickly/when they change, or how the meow groups differed. I am mostly interested in comparing the efficacy of the drugs.

I want to do mixed-modeling. I think treatment condition should be a fixed-factor, time points should be a covariate fixed-factor so that I can estimate a slope over time, and that time points need to be estimated as repeated-measures estimates. I also appreciate that I should run different models and test them and examine fit.

Overall though, what is the most appropriate way to test my hypothesis using a mixed-model design in SPSS (given what information you have)?

This is the SPSS syntax I have so far:
MIXED pain BY drug WITH time
  /CRITERIA=....
  /FIXED=drug time drug*time | SSTYPE(3)
  /METHOD=REML
  /RANDOM=INTERCEPT | SUBJECT(id) COVTYPE(ID)
  /REPEATED=time | SUBJECT(id) COVTYPE(AR1).

(meow group is missing from this syntax because I don't know where it is supposed to go).

I have a general understanding of what a random and a fixed factor is, but I don't quite understand how these impact each other in practice (please note that I can't read most formulas, so plain English will be needed if you want to help on that front!). For example, how would adding drug (or time) as a random effect influence the estimate of the slopes I am interested in? Similarly:

Should time points also be a random effect?
How should I handle the "meow group" variable? Fix or random, or both?
How does specifying a subject variable impact the random effect’s association with the dependent variable?
Is there a minimum number of levels necessary for a variable to be a random effect? I've read that only having two levels does not make it possible.
What are the "levels" in mixed-effects modelling referring to? Is student within class within school within school board within state within country a "6-level" model, or are the levels something else?

Lastly, how can I “nest” pain estimates within times, time within cases, cases within groups, and groups within treatments? Are pain ratings automatically nested within cases and time automatically nested within cases and cases within groups? I cannot "nest" group factors in the subject factors using SPSS it seems.

Sorry for all the questions… I figured putting this all in one related post would be better than spamming the front page with 10 questions. I have read previous related answers on this website without finding answers to these questions (but answers to other questions of course!).

Answers to any of these questions would be appreciated (even info on a single question would be great)!

Best Answer

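I know this is not exactly what you asked, but in R (http://www.r-project.org/) I would fit this formula: pain ~ drug + time + (1+time|meow) + (1+time|subject). To test whether the slope differs between the drugs, which is your main question, also include the drug-by-time interaction that your SPSS syntax already has: pain ~ drug * time + (1+time|meow) + (1+time|subject)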

Some notes:

This should take care of the nesting, provided each subject ID is unique across the whole study (that is, IDs are not reused in different meow groups).
I gave time both a fixed effect and a random effect, since I'm assuming time has some fixed influence that is not related to group/subject, and that each group/subject can react differently to time.
Regarding levels: in your case, the 16 groups and the 160 subjects can both be used for random effects. What is your concern here?
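Listing meow and subject as separate random-effect terms only captures the nesting if subject IDs are never reused across groups. Here is a minimal Python sketch of how one might check that before fitting. The row layout and ID scheme are hypothetical (nothing here comes from the asker's data), and the pain values are left empty on purpose.

```python
from collections import defaultdict

# Hypothetical long-format rows: (subject, meow_group, drug, week, pain).
# Design from the question: 16 groups of 10 people, 8 groups per drug, 15 weeks.
rows = []
for group in range(16):
    drug = "X" if group < 8 else "Y"
    for member in range(10):
        subject = group * 10 + member  # ID is unique across the whole study
        for week in range(1, 16):
            rows.append((subject, group, drug, week, None))  # pain not simulated

SUBJECT, GROUP, DRUG = 0, 1, 2  # column positions in each row


def is_nested(rows, inner, outer):
    """True if every level of column `inner` occurs with exactly one level of `outer`."""
    parents = defaultdict(set)
    for row in rows:
        parents[row[inner]].add(row[outer])
    return all(len(levels) == 1 for levels in parents.values())


print(is_nested(rows, SUBJECT, GROUP))  # subjects within meow groups
print(is_nested(rows, GROUP, DRUG))     # meow groups within drugs
print(is_nested(rows, GROUP, SUBJECT))  # the reverse direction should fail
```

If the first two checks hold on the real data, the nesting is already encoded in the ID columns, and no separate "nest" instruction is needed in the model formula.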

I hope this answers some of your questions.

Also, I'm currently researching new ways to automate statistical analysis. My focus is on enabling users to easily create mixed models. Would you be willing to share your data with me so I could try running the analysis on it?

Related Solutions

Solved – Random effects in mixed models

Here are my answers to your questions:

1) No, you do not need to account for the grouping here, because the random intercepts are estimated for each cluster (here, each person), not for each group--this is why you have the grouping variable in your fixed effects as a predictor of the intercept (the "main effect") or of the slopes (the interaction terms)

2) The RANDOM line specifies the random effect, which is what you seem to want. The REPEATED line allows for a different level-1 residual variance-covariance matrix; for example, you can allow residuals from one timepoint to be correlated with the next (autoregressive structure). In this case, however, you do not need a REPEATED line, since you have only two timepoints (there is only one residual correlation to be estimated).
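To make the autoregressive structure concrete, here is a small Python sketch (not SPSS syntax) of the correlation pattern that an AR(1) residual structure assumes: residuals k timepoints apart correlate rho to the power k. The function name and the value rho = 0.5 are made up for illustration.

```python
def ar1_correlation(n_timepoints, rho):
    """Correlation matrix in which residuals k steps apart correlate rho**k."""
    return [[rho ** abs(i - j) for j in range(n_timepoints)]
            for i in range(n_timepoints)]


# With four timepoints the correlation decays with the lag.
for row in ar1_correlation(4, 0.5):
    print(row)

# With two timepoints there is a single off-diagonal value, so AR(1)
# has nothing extra to structure.
print(ar1_correlation(2, 0.5))  # [[1.0, 0.5], [0.5, 1.0]]
```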

3) Yes, it is correct to use TIME as a fixed factor. The FIXED subcommand simply determines which fixed effects will be estimated, much like in GLM. Therefore, you do want the effect of TIME (mean difference from pre to post, controlling for other effects in your model) to be estimated. Where you decide whether to treat TIME as a factor or as a continuous predictor is in the very first line: BY denotes a categorical factor, which will be automatically dummy-coded, and WITH denotes a continuous predictor. Therefore, you'd want your first line to be: MIXED Score BY Group Treatment WITH Time

4) Baseline differences between groups are embedded in the Group fixed effect, which represents the mean difference between the two groups, when $\text{Time}=0$ since there is an interaction term between Group and Time. Therefore, where you decide to center your time variable (i.e. which timepoint gets the value 0) is crucial. By including the Group fixed effect, you are controlling for differences between groups at $\text{Time}=0$.
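The arithmetic behind this can be sketched in a few lines of Python. The coefficient values are hypothetical, not estimates from any data; the point is only that, with a Group-by-Time interaction in the model, the Group coefficient equals the group difference at whichever timepoint is coded 0.

```python
# Hypothetical fixed-effect estimates for the model
#   score = b0 + b_group*group + b_time*time + b_inter*group*time
b_group, b_inter = 4.0, -1.5


def group_difference(time):
    """Model-implied difference between the two groups at a given value of time."""
    return b_group + b_inter * time


def recentered_group_effect(center):
    """Group coefficient obtained after recoding time as (time - center)."""
    return group_difference(center)


print(recentered_group_effect(0))  # 4.0 (time 0 = pre: the baseline difference)
print(recentered_group_effect(1))  # 2.5 (time recoded so that post = 0)
```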

Mixed Model – Can Random Effects Be Nested Within a Factor with Only Two Observations in Linear Mixed Models?

There seems to be some confusion.

"My understanding is that because there's only 2 participants in each family, I can't specify random slopes because there are insufficient degrees of freedom"

This does not make sense to me. In general, random slopes do not make sense when the variable in question does not vary within subjects. So if you have repeated measures within levels of a grouping variable, then you can, in principle, fit random slopes (provided that they are supported by the data).

On the other hand, depending on the level at which you are taking your measurements / observations, the model may be misspecified. If you are measuring variables at the family level (e.g. father's ethnicity or mother's education level), or if you are making repeated measures at the family level (e.g. annual household income over several years, or family address, which may change over time), then the proposed model should be a good place to start. However, if you are making repeated measures per twin, then you will need to fit random intercepts for twin ID, varying within family:

lmer(x ~ y + (1|familyID/twinID), dat)

Regarding the issue of nesting:

"I'm wondering if this has any impact on whether you can nest other random effects within the family random effect."

There is no reason why you can't have further random effects nested within family. For example, as mentioned above, if you have repeated measures within individual twins then you would fit nested random effects. Although there are only 2 twins in each family, when fitting nested random effects it is the number of levels of the upper level factor that is important, since:

x ~ y + (1|familyID/twinID)

is exactly the same as

x ~ y + (1|familyID) + (1 |familyID:twinID)

There will always be more levels of familyID:twinID than of familyID alone, so the constraint on the number of levels comes from familyID, not twinID.
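A quick way to see the level counts is to enumerate them. This Python sketch assumes a made-up data set of 5 families in which the twins are labelled 1 and 2 within each family; it counts the distinct levels of each grouping term in the expanded formula.

```python
# Hypothetical data: 5 families, twins labelled 1 and 2 within each family.
observations = [(family, twin) for family in range(1, 6) for twin in (1, 2)]

family_levels = {family for family, _ in observations}
twin_levels = {twin for _, twin in observations}
# The interaction term familyID:twinID identifies each individual twin.
family_twin_levels = {f"{family}:{twin}" for family, twin in observations}

print(len(family_levels))       # 5
print(len(twin_levels))         # 2
print(len(family_twin_levels))  # 10
```

The twinID label on its own has only two levels, but it never enters the model alone: the random intercepts are estimated over the 5 families and the 10 family:twin combinations.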
