Solved – Regression and simple SEM: identification and model comparison

Tags: model-comparison, regression, structural-equation-modeling

I have two problems I am trying to solve. I am using MPLUS, but I have very general questions for which MPLUS knowledge will not be necessary.
Terminology: y = DV, x = IV

(1) Simple regression (N=1300)

I want to compare two linear regressions.
In the first model, a metric DV is predicted by 14 IVs as well as 2 covariates (age, sex).
In the second model, the regression coefficients of the 14 IVs are constrained to be equal; only age and sex are freely estimated.

In MPLUS, a simplified example would look like this ("on" marks regressions, and the brackets set regression weights equal):

Model I:

y on x1;
y on x2;
y on x3;
...
y on sex age;

Model II:

y on x1 (1);
y on x2 (1);
y on x3 (1);
...
y on sex age;

The first regression has 0 DF. Therefore, fit cannot be assessed, which in turn makes model comparison impossible.

How do I get the first model identified so I can compare the two models? Does a linear regression always have 0 DF? I thought about constraining two of the fourteen regression weights that are very close to each other to be equal; is that a valid or common approach? For instance, x2 and x14 have similar nonsignificant coefficient estimates of about .050, which I could constrain to be equal to identify the model. If I do that, I get a chi-square of 0.000 for that model with 1 DF; can that be correct? The fully constrained model has a chi-square of 398 with 13 DF.

My next question is how to compare these models. MPLUS outputs AIC, BIC, RMSEA, SRMR, and CFI/TLI, and I understand how to interpret them in general. However, am I allowed to interpret them for a simple regression? Am I allowed to interpret them for a model with 0 DF?

And as for the model test: is it correct that I simply take the difference of the two models' chi-square values, multiply it by 2, and look up the result in a chi-square table at the difference between the models' degrees of freedom?

(2) SEM (N=3500)

The second problem is somewhat similar but involves more complex models. I am new to structural equation modeling, but I assume these models qualify as SEMs (?).

Model I consists of 9 regressions with metric DVs; each regression has the same 6 predictors. The model allows all variables to be correlated (y with y and x with x as well). The DV is the second measurement point of y (yt2), and each regression controls for the first measurement point (yt1), so we are really predicting change in y.

Model I estimates all regressions freely, while model II constrains each x to affect all y in the same way. Again, a simplified example:

Model I:

y1t2 on x1 x2 x3 y1t1;
y2t2 on x1 x2 x3 y2t1;
...
y9t2 on x1 x2 x3 y9t1;

Model II:

y1t2 on x1 x2 x3 y1t1 (1 2 3 4);
y2t2 on x1 x2 x3 y2t1 (1 2 3 5);
...
y9t2 on x1 x2 x3 y9t1 (1 2 3 6);

Is it valid to compare these two models in the way mentioned above (chi-square difference test)? Can fit indices be interpreted normally?

Model I has a chi-square of 197.293 with 72 DF, model II 617.376 with 120 DF, so that would be 120 − 72 = 48 DF and 2*(617.376 − 197.293) → p < .001?

Best Answer

Regarding your first question, part 1:

In SEM, a linear regression is "just-identified." Such a model is also called "fully saturated."

A simpler example with 2 IVs and 1 DV gives:

3 variances and 3 covariances in the observed covariance matrix, i.e. 6 pieces of information for the model to reproduce.

Your regression estimates 2 regression coefficients, 2 IV variances, 1 covariance between the IVs (you may or may not realize this is in the model, but it is), and 1 error variance = 6 parameters.

6 pieces of information − 6 parameters = 0 DF.

Unless you impose constraints, regression models in SEM are always fully saturated, so model fit cannot be assessed.
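That counting exercise generalizes to the asker's model (14 IVs plus age and sex = 16 predictors). A hypothetical helper in Python, assuming the covariance-structure parameterization described above (slopes, predictor variances and covariances, one residual variance):

```python
def regression_sem_df(n_predictors):
    """DF of an unconstrained regression in SEM terms:
    observed moments minus free parameters (covariance structure only)."""
    n_vars = n_predictors + 1                            # predictors + 1 DV
    moments = n_vars * (n_vars + 1) // 2                 # variances + covariances
    params = (n_predictors                               # regression slopes
              + n_predictors                             # predictor variances
              + n_predictors * (n_predictors - 1) // 2   # predictor covariances
              + 1)                                       # DV residual variance
    return moments - params
```

`regression_sem_df(2)` and `regression_sem_df(16)` both return 0; the counts cancel algebraically for any number of predictors, which is why the unconstrained model is always saturated, and why each equality constraint among the slopes buys exactly one DF (13 constraints among 14 slopes → 13 DF, matching the constrained model's test).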

Regarding Part 2:

I agree with Patrick that these are nested models, so you can test the constraints with a chi-square (χ²) difference test.
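Note that for ML chi-square fit statistics the test is simply the plain difference of the two reported chi-square values, referred to a chi-square distribution with DF equal to the difference in model DF; the factor of 2 belongs to differencing raw log-likelihoods, not chi-square statistics. A sketch in plain Python using the asker's numbers (the closed-form tail below is only valid for even DF, which happens to hold here):

```python
import math

def chi2_sf_even_df(x, df):
    """Chi-square survival function P(X > x) for even df:
    exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)**i / i!  (Poisson-Erlang identity)."""
    assert df % 2 == 0 and df > 0
    half = x / 2.0
    return math.exp(-half) * sum(half**i / math.factorial(i)
                                 for i in range(df // 2))

# Constrained model (617.376, 120 DF) vs. free model (197.293, 72 DF)
delta_chi2 = 617.376 - 197.293   # 420.083 -- no factor of 2
delta_df = 120 - 72              # 48
p = chi2_sf_even_df(delta_chi2, delta_df)
```

Here `p` is far below .001, so the equality constraints are rejected; a Δχ² of 420 on 48 DF is enormous by any conventional critical value.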