Mixed Model – Multiple Membership vs Crossed Random Effects Explained

mixed model, multilevel-analysis, multiple-membership, random-effects-model, repeated measures

I see that there is a multiple-membership tag, but I can't find a good explanation of what a multiple membership model is, or how to go about fitting one.
In my limited understanding, it seems very similar to a cross-classified model. That is, units at one level don't "belong" to a single unit at another level; they can belong to many. So, in a healthcare setting, a patient might be treated in one hospital for one condition and in another hospital for another condition, so patients are not nested in hospitals; they seem crossed. Is this multiple membership? If so, how is it different from cross-classified models? I know that cross-classified models are very common in the mixed modelling world, so I assume the same is true of multiple membership, although I do not see much about multiple membership in the mixed models literature.

Are multiple membership models the same as cross classified models ? In this answer, it is stated:

"the latter is a crossed design (some might also call it multiple membership)"

This leads me to think that they are the same, although it is somewhat ambiguous.

If not, then what are they and how do we fit them?

Best Answer

Note this has been edited to address the issue of how to construct the model matrix for the random effects.

I agree that this can be confusing. But before answering, I would just like to be a bit pedantic and mention that multiple membership (and nesting, and crossing) is not a property of the model. It is a property of the experimental/study design, which is then reflected in the data, which is then encapsulated by the model.

Are multiple membership models the same as cross classified models ?

No, they are not. My answer that you linked to is ambiguous on this because some people, erroneously in my opinion, use the two terms interchangeably in certain situations (more on this below), when in fact they are quite different.

The example you mentioned, patients in hospitals, is a very good one. The key here is to think about the lowest level of measurement, and where the repeated measures occur. If patients are the lowest level of measurement (that is, there are no repeated measures within patients), then patient will not be a grouping variable; that is, we would not fit random intercepts for it, so by definition there cannot be crossed random effects involving patient. On the other hand, if there are repeated measures within patients then we would fit random intercepts for patients, and therefore we would have crossed random effects for patient and hospital. In the former case we would call this a model with multiple membership, but in the latter case we would call it a model with crossed random effects (in reality it will probably be partially nested and partially crossed). Some people seem to consider both to be multiple membership, and the latter to be just a special case (hence my ambiguous statement in the linked answer). I just think this confuses the situation.

So to give a definition of multiple membership, I would say this occurs when the lowest level units "belong" to more than one upper-level unit. So, following the same example,

- if there are no repeated measures within patients, the lowest-level unit is the patient; if a patient is treated in more than one hospital, we have multiple membership
- if there are repeated measures within patients, the lowest-level unit is the measurement occasion, which is nested within patients, and patients are (probably partially) crossed with hospitals.
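The definition above can be written as a small check. This is only an illustrative sketch in Python: the function name `has_multiple_membership` and the `memberships` mapping (each lowest-level unit mapped to the set of upper-level units it belongs to) are hypothetical, and the second toy data set assumes that each measurement occasion takes place in exactly one hospital.

```python
def has_multiple_membership(memberships):
    """Return True if any lowest-level unit belongs to more than one
    upper-level unit (hypothetical helper, for illustration only)."""
    return any(len(groups) > 1 for groups in memberships.values())


# No repeated measures: the patient is the lowest-level unit.
patients = {1: {"A", "F", "H"}, 2: {"B", "F"}, 6: {"F"}}

# Repeated measures: the measurement occasion is the lowest-level unit.
# Assumption: each occasion happens in exactly one hospital.
occasions = {("p1", 1): {"A"}, ("p1", 2): {"F"}, ("p2", 1): {"B"}}

multi_patients = has_multiple_membership(patients)    # True
multi_occasions = has_multiple_membership(occasions)  # False
```

In the second case no lowest-level unit has more than one hospital, so under this definition the design is not multiple membership; patient `p1` still appears in two hospitals, which is what makes patients and hospitals (partially) crossed.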

how do we fit them ?

In the multilevel modelling world, software such as MLwiN can fit multiple membership models "out of the box". With mixed effects models, things are not straightforward, at least with the packages I am familiar with. The problem is that the data will look something like this:

  Y  PatientID  HospA  HospB  HospC  HospD  HospE  HospF  HospG  HospH
0.1          1      1      0      0      0      0      1      0      1
0.5          2      0      1      0      0      0      1      0      0
2.3          3      0      0      1      0      0      1      0      0
0.7          4      1      0      0      0      0      0      1      0
1.0          5      0      1      0      0      0      1      0      1
3.2          6      0      0      0      0      0      1      0      0
2.1          7      0      0      0      0      0      0      1      0
2.6          8      0      0      0      0      1      0      0      1

Other representations of the data are obviously possible but I think this makes most sense, and makes what follows easier to understand. Edit: It also makes the construction of the model matrix for the random effects quite straightforward (see the edit below).
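As one example of another representation, the memberships could arrive in "long" form, with one (patient, hospital) pair per row. The Python sketch below converts such pairs into the 0/1 indicator layout shown above. The names `long_rows`, `HOSPITALS` and `to_indicator_rows` are made up for illustration, and the long-format input is simply the table above re-expressed, not additional data.

```python
HOSPITALS = ["A", "B", "C", "D", "E", "F", "G", "H"]

# Hypothetical long-format input: one (patient, hospital) pair per membership.
long_rows = [
    (1, "A"), (1, "F"), (1, "H"),
    (2, "B"), (2, "F"),
    (3, "C"), (3, "F"),
    (4, "A"), (4, "G"),
    (5, "B"), (5, "F"), (5, "H"),
    (6, "F"),
    (7, "G"),
    (8, "E"), (8, "H"),
]


def to_indicator_rows(pairs, hospitals):
    """Map each patient ID to a list of 0/1 membership indicators,
    one entry per hospital, in the order given by `hospitals`."""
    col = {h: j for j, h in enumerate(hospitals)}
    wide = {}
    for patient, hospital in pairs:
        row = wide.setdefault(patient, [0] * len(hospitals))
        row[col[hospital]] = 1
    return wide


wide = to_indicator_rows(long_rows, HOSPITALS)
print(wide[1])  # [1, 0, 0, 0, 0, 1, 0, 1]
```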

Clearly, it does not make sense to fit a separate random-intercepts term for each of the hospital indicator columns. However, we have repeated measures within hospitals, so we need to account for this somehow, since observations within a hospital are more likely to be similar to each other than to observations in other hospitals. Moreover, not only are there likely to be correlations within hospitals, but each hospital that a patient belongs to contributes to the (single) measured outcome for that patient.

I don't know if there is an agreed upon way to handle this with mixed models, but Doug Bates and Ben Bolker have both shown how it can be done in lme4:

https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q2/006318.html
https://rstudio-pubs-static.s3.amazonaws.com/442445_4a48ad854b3e45168708cfe4f007d544.html

I won't mention the specifics of how to do it in lme4, but the idea is to

1. Create a dummy grouping variable (HospitalID with levels A - H using the above example).
2. Fit a model with random intercepts for the dummy. Some software (e.g., lme4) allows the model to be constructed internally without actually fitting it. We don't need it to be fitted - only to create the model matrix.
3. Construct the correct model matrix for the random effects yourself. This will be based on the HospA - HospH columns of the above example.
4. Update the model with the correct model matrix.
5. (Re)fit the updated model.

Edit: to address the question of how to construct the model matrix for the random effects

In a mixed model setting, we usually work with the general mixed model formula:

$$ y = X \beta + Zu + \epsilon$$

In the above example, we want to fit random intercepts for hospitals. The purpose of the model matrix $Z$ is to map the relevant random effects, $u$, onto the response. In the above example we have 8 hospitals. Therefore, the random effects (random intercepts) will be a vector of length 8. For simplicity let's say that it is:

$$ u = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \\ 5 \\ 6 \\ 7 \\ 8 \end{bmatrix} $$

Now, if we look at patient 1, they are in hospitals A, F and H. So that patient will get a contribution of 1 from hospital A, 6 from hospital F and 8 from hospital H. We could alternatively write this as:

$$ (1 \times 1) + (0 \times 2) +( 0 \times 3) + (0 \times 4) + (0 \times 5) + (1 \times 6) + (0 \times 7) + (1 \times 8) $$

We can now see that this is exactly the dot product of two vectors:

$$ \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \\ 5 \\ 6 \\ 7 \\ 8 \end{bmatrix} $$

We can now observe that the row-vector above is exactly the same as the row in the data for the hospitals:

  Y PatientID HospA HospB HospC HospD HospE HospF HospG HospH
0.1         1     1     0     0     0     0     1     0     1

Therefore each row of the model matrix is simply the corresponding row of the hospital "membership" indicators, and the full structure of $Zu$ for the above data is:

$$ Zu = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \\ 5 \\ 6 \\ 7 \\ 8 \end{bmatrix} $$
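To make the product concrete, here is a minimal Python sketch that evaluates $Zu$ for the example, using the membership indicators from the data and the illustrative values 1 to 8 for $u$. The helper `dot` and the variable `shared` are just names chosen for this sketch.

```python
# Membership indicators (columns HospA..HospH), one row per patient.
Z = [
    [1, 0, 0, 0, 0, 1, 0, 1],
    [0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0, 0, 1],
]
u = [1, 2, 3, 4, 5, 6, 7, 8]  # the illustrative random intercepts


def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


# Each patient's total hospital contribution is the dot product of
# their membership row with u.
Zu = [dot(row, u) for row in Z]
print(Zu)  # [15, 8, 9, 8, 16, 6, 7, 13]

# Entry (i, j) of Z Z^T counts the hospitals shared by two patients.
shared = [[dot(ri, rj) for rj in Z] for ri in Z]
print(shared[0][4])  # patients 1 and 5 share hospitals F and H: 2
```

Patient 1 gets 1 + 6 + 8 = 15, matching the dot product worked out above. If we additionally assume (this is not stated above) that the random intercepts are independent with a common variance $\sigma_u^2$, then the covariance of $Zu$ is $\sigma_u^2 ZZ^T$, so the `shared` matrix shows which pairs of patients the model treats as correlated through common hospitals.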
