SEM – Differences Between a MIMIC Factor and a Composite with Indicators in Structural Equation Modeling

latent-variable, lavaan, structural-equation-modeling

In structural equation modeling (SEM) with latent variables, a common formulation is the "Multiple Indicators, Multiple Causes" (MIMIC) model, in which a latent variable is caused by some observed variables and reflected by others. Here's a simple example:
[Figure: path diagram of the example MIMIC model]

Essentially, f1 is regressed on x1, x2 and x3, while y1, y2 and y3 are measurement indicators of f1.

One can also define a composite latent variable, where the latent variable basically amounts to a weighted combination of its constituent variables.
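
To make the two definitions concrete, they can be written as equations. This is only a sketch in generic notation: the symbols \lambda, \gamma, w, \varepsilon and \zeta are assumptions introduced here, not taken from the question, and intercepts are omitted.

```latex
% MIMIC model: reflective measurement part plus a regression of f_1 on the causes
\begin{align}
  y_j &= \lambda_j f_1 + \varepsilon_j, \qquad j = 1, 2, 3 \\
  f_1 &= \gamma_1 x_1 + \gamma_2 x_2 + \gamma_3 x_3 + \zeta
\end{align}

% Composite in the strict textbook sense: an exact weighted sum with no disturbance
\begin{equation}
  f_1 = w_1 x_1 + w_2 x_2 + w_3 x_3
\end{equation}
```

Written this way, the only structural difference is the disturbance \zeta. Whether a given program keeps or drops that term when a composite also has reflective indicators is a software question, and it is what the comparison in the rest of this question is about.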

Here's my question: is there a difference between defining f1 as a regression outcome and defining it as a composite outcome in a MIMIC model?

Some testing using lavaan software in R shows that the coefficients are identical:

library(lavaan)

# load/prep data
data <- read.table("http://www.statmodel.com/usersguide/chap5/ex5.8.dat")
names(data) <- c(paste("y", 1:6, sep=""), paste("x", 1:3, sep=""))

# model 1 - canonical mimic model (using the '~' regression operator)
model1 <- '
    f1 =~ y1 + y2 + y3
    f1 ~ x1 + x2 + x3
'

# model 2 - seemingly the same (using the '<~' composite operator)
model2 <- '
    f1 =~ y1 + y2 + y3
    f1 <~ x1 + x2 + x3
'

# run lavaan
fit1 <- sem(model1, data=data, std.lv=TRUE)
fit2 <- sem(model2, data=data, std.lv=TRUE)

# test equality - only the operators are different
all.equal(parameterEstimates(fit1), parameterEstimates(fit2))
# [1] "Component “op”: 3 string mismatches"
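
A further check, beyond the parameter table, is to compare global fit and the model-implied covariance matrices, since equivalent models should agree on both. The following is a sketch that assumes the same data URL and models as above, and a lavaan version that behaves as in the output shown; results may differ across versions.

```r
library(lavaan)

# Same data and models as in the question
data <- read.table("http://www.statmodel.com/usersguide/chap5/ex5.8.dat")
names(data) <- c(paste0("y", 1:6), paste0("x", 1:3))

model1 <- '
    f1 =~ y1 + y2 + y3
    f1 ~ x1 + x2 + x3
'
model2 <- '
    f1 =~ y1 + y2 + y3
    f1 <~ x1 + x2 + x3
'
fit1 <- sem(model1, data = data, std.lv = TRUE)
fit2 <- sem(model2, data = data, std.lv = TRUE)

# Global fit: chi-square statistic and degrees of freedom for each model
fitMeasures(fit1, c("chisq", "df"))
fitMeasures(fit2, c("chisq", "df"))

# Free-parameter estimates compared by position, ignoring their names
# (the names differ because they embed the operator)
all.equal(as.numeric(coef(fit1)), as.numeric(coef(fit2)))

# Model-implied covariance matrices of the observed variables
all.equal(unclass(fitted(fit1)$cov), unclass(fitted(fit2)$cov))
```

If all three comparisons agree, the two specifications are equivalent for this data set, which goes further than saying the coefficient table looks the same.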

How are these two models mathematically the same? My understanding is that regression formulas in SEM are fundamentally different from composite formulas, but this finding seems to contradict that idea. Furthermore, it's easy to come up with a model where the ~ operator is not interchangeable with the <~ operator (to use lavaan's syntax). Usually, using one in place of the other results in a model identification problem, especially when the latent variable is then used in a different regression formula. So when are they interchangeable, and when are they not?

Rex Kline's textbook (Principles and Practice of Structural Equation Modeling) tends to talk about MIMIC models with the terminology of composites, but Yves Rosseel, the author of lavaan, explicitly uses the regression operator in every MIMIC example I've seen.

Can somebody clarify this issue?

Best Answer

They're the same model.

The <~ operator is useful for defining a latent variable as a composite when that variable has only composite indicators, that is, no reflective (=~) indicators.

If you don't have:

f1 =~ y1 + y2 + y3

You can't put:

f1 ~ x1 + x2 + x3

But you can have:

f1 <~ x1 + x2 + x3
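
One way to see this distinction without fitting anything is to look at how lavaan parses each formula on its own. This is a minimal sketch, assuming that lavNames() accepts a parameter table produced by lavaanify(); the object names pt_reg and pt_comp are made up for illustration.

```r
library(lavaan)

# Regression only: no "=~" or "<~" line defines f1 as a latent variable
pt_reg <- lavaanify('f1 ~ x1 + x2 + x3')

# Composite only: the "<~" line itself defines f1
pt_comp <- lavaanify('f1 <~ x1 + x2 + x3')

# Which names does lavaan classify as latent ("lv") and as observed ("ov")?
lavNames(pt_reg, type = "lv")
lavNames(pt_reg, type = "ov")
lavNames(pt_comp, type = "lv")
lavNames(pt_comp, type = "ov")
```

If the description above holds, f1 should be listed among the observed names in the first case and among the latent names in the second. The exact classification may depend on the installed lavaan version, so it is worth running rather than taking on trust.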