How can one model the interaction between a categorical independent variable and a continuous moderator (a latent variable defined through a CFA) in an SEM using the lavaan package in R?
In particular, in my real dataset I am essentially interested in re-creating a two-way ANOVA in SEM, and also want to include a moderating variable to test with each factor variable.
Example data and problem:
### load packages: ###
library(dplyr)
library(lavaan)
library(car)
library(psychTools)
### Create some data: ###
# Dependent variable: Taken example data from psychTools
DV1 <- bfi$A1 # item 1
DV2 <- bfi$A2 # item 2
DV3 <- bfi$A3 # item 3
# Moderating variable: Taken example data from psychTools
MOD1 <- bfi$C1 # item 1
MOD2 <- bfi$C2 # item 2
MOD3 <- bfi$C3 # item 3
# Create example factor variables
x1 <- c("A","B")
x2 <- c("C","D")
set.seed(1)
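# Note: only 200 values are drawn per factor; data.frame() below recycles them across the 2800 rows of bfi
FAC1 <- as.factor(sample(x1, 200, replace = TRUE)) # Factor 1, with two levels "A" and "B"
FAC2 <- as.factor(sample(x2, 200, replace = TRUE)) # Factor 2, with two levels "C" and "D"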
FAC12 <- interaction(FAC1,FAC2) # Factor 12, interaction of FAC1 and FAC2, with four levels "A.C" "B.C" "A.D" "B.D"
# Combine to data frame
StudyData <- data.frame(DV1,DV2,DV3,
MOD1,MOD2,MOD3,
FAC1, FAC2, FAC12)
### Make all categorical variables numeric for use in SEM (orthogonal contrast coded as in ANOVA): ###
# car::recode is called explicitly because dplyr also exports a recode() function
StudyData$FAC1 <- car::recode(StudyData$FAC1, "c('A')='-1'; c('B')='1'")
StudyData$FAC1 <- as.numeric(levels(StudyData$FAC1))[StudyData$FAC1]
StudyData$FAC2 <- car::recode(StudyData$FAC2, "c('C')='-1'; c('D')='1'")
StudyData$FAC2 <- as.numeric(levels(StudyData$FAC2))[StudyData$FAC2]
StudyData$FAC12 <- car::recode(StudyData$FAC12, "c('A.D','B.C')='-1'; c('A.C','B.D')='1'")
StudyData$FAC12 <- as.numeric(levels(StudyData$FAC12))[StudyData$FAC12]
### SEM Model One: ###
Model.one <- '
# cfa
DV =~ DV1 + DV2 + DV3
MOD =~ MOD1 + MOD2 + MOD3
# regressions
DV ~ FAC1 + FAC2 + FAC12
'
Modelone <- sem(Model.one, StudyData, estimator="MLM", effect.coding=TRUE, meanstructure=TRUE)
summary(Modelone)
fitMeasures(Modelone, c("chisq","cfi","rmsea","srmr","nfi","gfi"))
### SEM Model Two: ###
Model.two <- '
# cfa
DV =~ DV1 + DV2 + DV3
MOD =~ MOD1 + MOD2 + MOD3
# regressions
DV ~ FAC1 + FAC1:MOD + FAC2 + FAC12
'
Modeltwo <- sem(Model.two, StudyData, estimator="MLM", effect.coding=TRUE, meanstructure=TRUE)
summary(Modeltwo)
fitMeasures(Modeltwo, c("chisq","cfi","rmsea","srmr","nfi","gfi"))
### EDIT ###
### SEM Model Three: ###
Model.three <- '
# cfa
DV =~ DV1 + DV2 + DV3
MOD =~ MOD1 + MOD2 + MOD3
# regressions
DV ~ FAC2 + MOD
'
Modelthree <- sem(Model.three, StudyData, estimator="MLM", effect.coding=TRUE, meanstructure=TRUE, group="FAC1")
summary(Modelthree)
fitMeasures(Modelthree, c("chisq","cfi","rmsea","srmr","nfi","gfi"))
Model one runs fine. I can run my "ANOVA" in the SEM environment.
However, when I run Model two, which includes an interaction term between FAC1 and MOD (the latent variable defined via CFA in the SEM), I receive the warning:
"lavaan WARNING:
The variance-covariance matrix of the estimated parameters (vcov)
does not appear to be positive definite! The smallest eigenvalue
(= -3.458498e-20) is smaller than zero. This may be a symptom that
the model is not identified."
Questions:
- Is it not possible to create a factor:continuous interaction in lavaan in this manner?
- Are there any workarounds, and how would I implement them? (For example, extract the values calculated during the CFA for MOD, compute the FAC1:MOD interaction outside of the SEM, then reuse that variable in the path-analysis (regressions) part of the SEM.)
- Can Mplus do this without the need for workarounds?
Best Answer
No, the `:` operator only works on observed variables. It triggers `lavaan` to actually calculate the product term and include it in the covariance matrix and mean vector to which the model is fitted. That cannot happen with a latent variable, which is not part of the observed variables' summary statistics.

Moderation is symmetric, so you could use a multigroup model, with the categorical IV as the grouping variable. Differences in the `DV~MOD` simple effects across groups would be moderation by `FAC1`. Differences in the `DV~1` intercepts across groups would be the simple effect of `FAC1`, which could be probed by centering `MOD`'s mean at different values. Or you might be able to use the `emmeans` utility in the `semTools` package to probe the interaction; see the `?lavaan2emmeans` help-page examples. I suggest another possibility below.

A multigroup model makes the `FAC12` variable redundant because `FAC2`'s intercept and effects are likewise moderated by `FAC1` by virtue of parameters differing across groups.

Yes, Mplus can simply use LMS estimation, but that comes with some restrictive assumptions. My student's PhD research (also here) has revealed that the product-indicator approach is less restrictive, and that can be implemented in `lavaan` (see her tutorial about using this method for invariance testing, available for download from my faculty page).
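
To make the two routes above concrete, here is a minimal sketch. It is not taken from the tutorial mentioned; it uses simulated stand-in data (the data frame `dat`, the helper `ind()`, the labels `b_g1`, `b_g2`, `slope_diff`, and the product-indicator names `MOD1xFAC1` etc. are all hypothetical names chosen for illustration). Part (a) fits the multigroup model with equal loadings and defines the difference between the `DV~MOD` slopes. Part (b) builds product indicators by hand from mean-centred `MOD` items and the contrast-coded factor, then regresses `DV` on a latent interaction.

```r
library(lavaan)

# Simulated stand-in data (an assumption for illustration, not the bfi items)
set.seed(1)
n    <- 500
FAC1 <- sample(c(-1, 1), n, replace = TRUE)          # contrast-coded factor
mod  <- rnorm(n)                                     # "true" moderator scores
dv   <- 0.3 * FAC1 + 0.4 * mod + 0.3 * FAC1 * mod + rnorm(n)
ind  <- function(f) f + rnorm(n, sd = 0.7)           # noisy indicator of f
dat  <- data.frame(FAC1,
                   DV1  = ind(dv),  DV2  = ind(dv),  DV3  = ind(dv),
                   MOD1 = ind(mod), MOD2 = ind(mod), MOD3 = ind(mod))

# (a) Multigroup route: one DV ~ MOD slope per FAC1 group.
# Loadings are held equal so the slopes are on a comparable metric.
mg_model <- '
  DV  =~ DV1 + DV2 + DV3
  MOD =~ MOD1 + MOD2 + MOD3
  DV ~ c(b_g1, b_g2) * MOD
  slope_diff := b_g1 - b_g2   # moderation by FAC1 (sign depends on group order)
'
fit_mg <- sem(mg_model, data = dat, group = "FAC1", group.equal = "loadings")

# (b) Product-indicator route: mean-centre each MOD item, then multiply
# it by the contrast-coded factor to obtain indicators of the interaction.
for (i in 1:3) {
  item <- dat[[paste0("MOD", i)]]
  dat[[paste0("MOD", i, "xFAC1")]] <- (item - mean(item, na.rm = TRUE)) * dat$FAC1
}
pi_model <- '
  DV  =~ DV1 + DV2 + DV3
  MOD =~ MOD1 + MOD2 + MOD3
  INT =~ MOD1xFAC1 + MOD2xFAC1 + MOD3xFAC1
  DV ~ FAC1 + MOD + INT       # the INT slope is the latent interaction effect
'
fit_pi <- sem(pi_model, data = dat)

summary(fit_mg)
summary(fit_pi)
```

In (a), the `slope_diff` row of the output tests whether the `DV~MOD` effect differs between the two `FAC1` groups. In (b), the `DV ~ INT` coefficient plays that role. With the real data you would substitute `StudyData` and would probably want to keep the `estimator="MLM"` setting from the question.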