I have two problems I am trying to solve. I am using MPLUS, but I have very general questions for which MPLUS knowledge will not be necessary.
Terminology: y = DV, x = IV
(1) Simple regression (N=1300)
I want to compare two linear regressions.
In the first model, a metric DV is predicted by 14 IVs as well as 2 covariates (age, sex).
In the second model, regression coefficients of the 14 IVs are constrained to be equal, only age and sex are freely estimated.
In MPLUS, a simplified example would look like this ("on" marks regressions, and the brackets set regression weights equal):
Model I:
y on x1;
y on x2;
y on x3;
...
y on sex age;
Model II:
y on x1 (1);
y on x2 (1);
y on x3 (1);
...
y on sex age;
The first regression has 0 DF. Therefore, fit cannot be assessed, which in turn makes model comparison impossible.
How do I get the first model identified so I can compare the two models? Does a linear regression always have 0 DF? I thought about constraining two of the fourteen regression weights that are very close to each other to be equal. Is that a valid or common approach? For instance, x2 and x14 have similar nonsignificant regression coefficient estimates of .050, which I could constrain to be equal to identify the model. If I do that, I get a chi-square of 0.000 for that model with 1 DF. Can that be correct? The constrained model has a chi-square of 398 with 13 DF.
My next question would be how to compare these models. MPLUS outputs AIC, BIC, RMSEA, SRMR, CFI/TLI, and I understand how to interpret them generally. However, am I allowed to interpret them for a simple regression? Am I allowed to interpret them in a model that is not identified?
And as for the model test, is it correct that I simply take the chi-square values of the two models, compute the difference, multiply it by 2, and look up the value in a chi-square table for the difference between the degrees of freedom of the models?
(2) SEM (N=3500)
The second problem is somewhat similar, but involves more complex models. I am new to structural equation modeling, but assume these models to be SEMs (?).
Model I consists of 9 regressions with metric DVs; each regression has the same 6 predictors. The model allows all variables to be correlated (including y with y and x with x). Each DV is the second measurement point of y (yt2), and each regression controls for the first measurement point of the same variable (yt1). So, we're really predicting changes in y.
Model I estimates all regressions freely, while model II constrains each x to affect all y in the same way. Again, a simplified example:
Model I:
y1t2 on x1 x2 x3 y1t1;
y2t2 on x1 x2 x3 y2t1;
...
y9t2 on x1 x2 x3 y9t1;
Model II:
y1t2 on x1 x2 x3 y1t1 (1 2 3 4);
y2t2 on x1 x2 x3 y2t1 (1 2 3 5);
...
y9t2 on x1 x2 x3 y9t1 (1 2 3 6);
Is it valid to compare these two models in the way mentioned above (chi-square difference test)? Can fit indices be interpreted normally?
Model I has a chi-square of 197.293 with 72 DF and model II has 617.376 with 120 DF, so would that be 120-72 DF and 2*(617.376-197.293), giving p<.001?
Best Answer
Regarding your first question, part 1:
Linear regression is "just-identified" in SEM. This is also called "fully-saturated."
A simpler example with 2 IVs and 1 DV gives:
3 variances and 3 covariances in the covariance matrix. These are the 6 pieces of information available to the SEM.
Your regression includes 2 regression beta coefficients, 2 IV variances, 1 covariance between the IVs (you may or may not realize this is in the model, but it is), and 1 error variance = 6 parameters
6 pieces of information - 6 parameters = 0 DF
Unless constraints are imposed, regression models in SEM are always fully saturated, so no assessment of model fit is possible.
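The same counting can be sketched for any number of predictors. This is only an illustration in Python, not Mplus output: the function name `regression_df` is made up here, and it counts the covariance structure only (no means or intercepts), as in the example above.

```python
def regression_df(n_predictors: int, n_equality_constraints: int = 0) -> int:
    """Degrees of freedom of a single-DV regression treated as an SEM.

    Hypothetical helper: counts covariance-structure information only.
    """
    n_vars = n_predictors + 1  # predictors plus the one DV

    # Unique variances and covariances in the observed covariance matrix.
    knowns = n_vars * (n_vars + 1) // 2

    # Free parameters: regression slopes, predictor variances/covariances,
    # and one residual variance for the DV.
    slopes = n_predictors
    predictor_var_cov = n_predictors * (n_predictors + 1) // 2
    residual_variance = 1
    free_parameters = slopes + predictor_var_cov + residual_variance

    # Each equality constraint removes one free parameter.
    free_parameters -= n_equality_constraints

    return knowns - free_parameters


print(regression_df(2))        # the 2-IV example above
print(regression_df(16))       # 14 IVs plus age and sex, all slopes free
print(regression_df(16, 1))    # two slopes set equal
print(regression_df(16, 13))   # all 14 IV slopes set equal
```

Under these assumptions the free model with 16 predictors has 0 DF, setting two slopes equal gives 1 DF, and setting all 14 IV slopes equal gives 13 DF, which matches the DF values reported in the question.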
Regarding Part 2:
I agree with Patrick that these are nested models and you can "test" the constraints with a chi2 test.
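For the Part 2 figures in the question, the test can be sketched as follows. The chi-square values reported for each model are already test statistics, so their difference is used directly; a factor of 2 only comes in when the test is built from log-likelihoods. This sketch assumes a plain ML estimator (with a robust estimator such as MLR, a scaled difference test is needed instead). The helper `chi2_sf_even_df` is written just for this illustration and uses a closed form that only holds for even DF; `scipy.stats.chi2.sf` covers the general case.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For df = 2m: P(X > x) = exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half_x = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half_x / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half_x) * total


# Values taken from the question (Part 2).
chisq_free, df_free = 197.293, 72                 # Model I
chisq_constrained, df_constrained = 617.376, 120  # Model II

delta_chisq = chisq_constrained - chisq_free  # no multiplication by 2
delta_df = df_constrained - df_free
p_value = chi2_sf_even_df(delta_chisq, delta_df)

print(f"delta chi-square = {delta_chisq:.3f}, delta df = {delta_df}, p = {p_value:.3g}")
```

The difference is about 420.083 on 48 DF, and the resulting p-value is far below .001, so the equality constraints significantly worsen fit.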