Multi-group SEM Interpretation – How to Interpret Group Differences with Weak Measurement Invariance

lavaan structural-equation-modeling

Context: I am using SEM (in lavaan) on a sample of about 1000 children aged 6 to 16 years who solved several cognitive tasks. The goal was to establish a model of how different cognitive domains predict reasoning abilities, which has worked very well for the complete sample. The measurement model showed a good fit, and so did my structural model. Here's a picture of my structural model for some background:
[Image: structural model]
The next thing I'm interested in is how age groups differ regarding these structural relations. I expect them to differ because cognitive processes might vary depending on developmental stage. I have split my sample into five age groups of n = 200 each. The splits are based on theory, and my hypotheses are that in specific age groups, the relation between WM and Gf is stronger or weaker than in others.

Questions: I've established weak measurement invariance in my measurement model: when I constrain the indicator variables' loadings to be the same across all age groups, the model still shows a good fit compared to an unconstrained model. However, when I constrain both loadings and intercepts to be the same, the approximate fit indices (AFIs) for the measurement model get significantly worse.

  1. How do I interpret the scalar non-invariance? In other words, what does it mean that intercepts differ across age groups? I'm new to SEM and struggle with understanding the concept of intercepts here.
  2. Which parameters can I compare between the age groups? My main interest lies in comparing the regression parameters in the SEM (e.g., "in age group 1, the path coefficient between WM and Gf is higher compared to age group 2"), but I'm not sure if that is "allowed" given the weak measurement invariance in the measurement model.

I'd be eternally grateful for help with this and can provide more information if my description wasn't clear enough!

Best Answer

Rather than discretizing age into arbitrary categories (which can have negative consequences; MacCallum et al., 2002), you could keep age continuous and use it as a predictor of the latent factors. This is called a multiple-indicator multiple-cause (MIMIC) model, and it can be used to test invariance as an alternative to multigroup CFA (see Kolbe & Jorgensen, 2019, and Kolbe et al., 2021, for reviews of single-group approaches for invariance tests).

Regarding your questions:

  1. In the MIMIC model, you can test for differential item/indicator functioning (DIF) by additionally regressing an indicator on age. If there is a significant direct effect of age on an indicator, then the indicator mean is a function of age even among subjects with the same factor score (standard interpretation of a partial regression slope). If not, then the relationship between age and an indicator is fully mediated by the factor (Montoya & Jeon, 2020).
  2. Comparison of regression slopes only requires metric ("weak") invariance.
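
To make both points concrete, here is a minimal simulation sketch in plain Python (not a lavaan analysis). It assumes a single indicator generated as intercept + loading × factor + noise, and it treats the factor scores as known, which is never the case with real data. The loading of 0.8, the intercepts of 0.0 and 1.5, and the function names are all made up for illustration. The two simulated groups share the same loading but differ in the intercept:

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    # Sample covariance with the usual n - 1 denominator.
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def simulate_group(intercept, loading=0.8, n=2000, seed=1):
    # Hypothetical data-generating model: y = intercept + loading * eta + e.
    # The same seed is reused on purpose, so the groups differ ONLY in the intercept.
    rng = random.Random(seed)
    eta = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [intercept + loading * f + rng.gauss(0.0, 0.5) for f in eta]
    return eta, y


def slope(eta, y):
    # Regression slope of the indicator on the factor: cov(eta, y) / var(eta).
    return cov(eta, y) / cov(eta, eta)


eta_a, y_a = simulate_group(intercept=0.0)  # "younger" group (made-up values)
eta_b, y_b = simulate_group(intercept=1.5)  # "older" group, shifted intercept

slope_a = slope(eta_a, y_a)
slope_b = slope(eta_b, y_b)
mean_gap = mean(y_b) - mean(y_a)

print("slope in group A:", round(slope_a, 3))
print("slope in group B:", round(slope_b, 3))
print("difference in indicator means:", round(mean_gap, 3))
```

The slopes agree because they depend only on covariances, and adding a constant to an indicator leaves its covariances untouched. The indicator means differ even though the factor scores are identical in both groups, which is what unequal intercepts mean: an observed mean difference that cannot be attributed to the latent factor. That is why latent mean comparisons need scalar invariance, while slope comparisons do not.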

If you use the MIMIC approach, product indicators can be used to define latent interactions with age (Kolbe & Jorgensen, 2017, as a lavaan example).
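
As a rough sketch of the data-preparation step only, product indicators can be built by mean-centering each indicator and age, multiplying them, and then centering the products again (double mean centering). The code below is plain Python with hypothetical names (`wm1`, `age`, `product_indicators`) and toy numbers; it is not taken from the cited chapter, whose exact procedure may differ. In practice the resulting columns would be added to the data set and used as indicators of a latent interaction factor in the lavaan model:

```python
def center(xs):
    # Subtract the sample mean from every value.
    m = sum(xs) / len(xs)
    return [x - m for x in xs]


def product_indicators(indicators, moderator, suffix="_x_age"):
    # indicators: dict mapping a (hypothetical) indicator name to its values.
    # moderator:  observed moderator values, e.g. age, in the same row order.
    mod_c = center(moderator)
    out = {}
    for name, values in indicators.items():
        raw_product = [v * a for v, a in zip(center(values), mod_c)]
        # Second centering step: the products themselves are centered.
        out[name + suffix] = center(raw_product)
    return out


# Toy data, invented for illustration only.
data = {"wm1": [1.0, 2.0, 3.0, 4.0]}
age = [6.0, 8.0, 10.0, 12.0]

products = product_indicators(data, age)
print(products["wm1_x_age"])
```

With a single observed moderator such as age, each factor indicator yields one product indicator. Whether double mean centering is the best choice for a given model is a judgment call, so treat this as one reasonable option rather than the required one.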

Moderated nonlinear factor analysis (MNLFA) is a more intuitive option (Bauer, 2017; Bauer et al., 2020), but lavaan does not support it. It should be possible using OpenMx.

References:

  • Bauer, D. J. (2017). A more general model for testing measurement invariance and differential item functioning. Psychological Methods, 22(3), 507–526. https://doi.org/10.1037/met0000077
  • Bauer, D. J., Belzak, W. C., & Cole, V. T. (2020). Simplifying the assessment of measurement invariance over multiple background variables: Using regularized moderated nonlinear factor analysis to detect differential item functioning. Structural Equation Modeling, 27(1), 43–55. https://doi.org/10.1080/10705511.2019.1642754
  • Kolbe, L., & Jorgensen, T. D. (2017). Using product indicators in restricted factor analysis models to detect nonuniform measurement bias. In Quantitative Psychology (pp. 235–245). Springer. https://doi.org/10.1007/978-3-319-77249-3_20
  • Kolbe, L., & Jorgensen, T. D. (2019). Using restricted factor analysis to select anchor items and detect differential item functioning. Behavior Research Methods, 51(1), 138–151. https://doi.org/10.3758/s13428-018-1151-3
  • Kolbe, L., Jorgensen, T. D., & Molenaar, D. (2021). The impact of unmodeled heteroskedasticity on assessing measurement invariance in single-group models. Structural Equation Modeling, 28(1), 82–98. https://doi.org/10.1080/10705511.2020.1766357
  • Montoya, A. K., & Jeon, M. (2020). MIMIC Models for Uniform and Nonuniform DIF as Moderated Mediation Models. Applied Psychological Measurement, 44(2), 118–136. https://doi.org/10.1177/0146621619835496