I am trying to replicate a path analysis SEM model using lavaan in R, and I am confused by the model fit statistics it reports.
The code is as follows:
#Import Package
library(lavaan)
#Input Correlation Matrix
sigma <- matrix(c(1.00, -0.03, 0.39, -0.05, -0.08,
-0.03, 1.00, 0.07, -0.23, -0.16,
0.39, 0.07, 1.00, -0.13, -0.29,
-0.05, -0.23, -0.13, 1.00, 0.34,
-0.08, -0.16, -0.29, 0.34, 1.00), nrow=5, byrow=TRUE)
rownames(sigma) <-c("Exercise", "Hardiness", "Fitness", "Stress", "Illness")
colnames(sigma) <-c("Exercise", "Hardiness", "Fitness", "Stress", "Illness")
#Create Covariance Matrix
sdevs <-c(66.5, 3.8, 18.4, 6.7, 624.8)
covmax <- cor2cov(sigma, sdevs)
as.matrix(covmax)
#Specify Model
mymodel<-'Illness ~ Exercise + Fitness
Illness ~ Hardiness + Stress
Fitness ~ Exercise + Hardiness
Stress ~ Exercise + Hardiness + Fitness
Exercise ~~ Exercise
Hardiness ~~ Hardiness
Exercise ~~ Hardiness'
#Fit the model with the covariance matrix
N = 363
fit.path <-sem(mymodel,sample.cov=covmax, sample.nobs=N, fixed.x=FALSE)
#Summary of the model fit
summary(fit.path, fit.measures = TRUE)
And the output I get is as follows:
lavaan (0.5-12) converged normally after 93 iterations
Number of observations 37300
Estimator ML
Minimum Function Test Statistic 0.000
Degrees of freedom 0
P-value (Chi-square) 1.000
Model test baseline model:
Minimum Function Test Statistic 16594.387
Degrees of freedom 10
P-value 0.000
Full model versus baseline model:
Comparative Fit Index (CFI) 1.000
Tucker-Lewis Index (TLI) 1.000
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -882379.005
Loglikelihood unrestricted model (H1) -882379.005
Number of free parameters 15
Akaike (AIC) 1764788.009
Bayesian (BIC) 1764915.910
Sample-size adjusted Bayesian (BIC) 1764868.240
Root Mean Square Error of Approximation:
RMSEA 0.000
90 Percent Confidence Interval 0.000 0.000
P-value RMSEA <= 0.05 1.000
Standardized Root Mean Square Residual:
SRMR 0.000
Parameter estimates:
Information Expected
Standard Errors Standard
Estimate Std.err Z-value P(>|z|)
Regressions:
Illness ~
Exercise 0.318 0.048 6.640 0.000
Fitness -8.835 0.174 -50.737 0.000
Hardiness -12.146 0.793 -15.321 0.000
Stress 27.125 0.451 60.079 0.000
Fitness ~
Exercise 0.109 0.001 82.602 0.000
Hardiness 0.396 0.023 17.211 0.000
Stress ~
Exercise -0.001 0.001 -2.614 0.009
Hardiness -0.393 0.009 -44.332 0.000
Fitness -0.040 0.002 -19.953 0.000
Covariances:
Exercise ~~
Hardiness -7.581 1.309 -5.791 0.000
Variances:
Exercise 4422.131 32.381
Hardiness 14.440 0.106
Illness 318744.406 2334.012
Fitness 284.796 2.085
Stress 41.921 0.307
These are my questions:
- Why does the chi-squared say that there are no degrees of freedom?
- Why are the p-values exactly 1? Why are the CFI and TLI exactly 1?
- Why is the RMSEA 0?
- What would I need to do to simulate a more realistic model that doesn't appear artificially "perfect"?
- Does it have to do with the model specification?
Best Answer
It appears that this is a model where (almost) everything is regressed on everything else.
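You have 5 variables in your model. That means you have 10 covariances, or 15 unique elements of the covariance matrix once you include the 5 variances.
You have 10 path and covariance parameters (9 regression coefficients and 1 covariance), plus 5 variances. That is 15 free parameters, which matches the "Number of free parameters 15" line in your output.
The df of the model is equal to (number of unique variances and covariances) - (number of free parameters). This is 15 - 15 = 0. The model is described as saturated, and it's not testing anything. Because it's not testing anything, the fit indices are all perfect. (This will make sense if you look at the formulas for the fit indices: a zero chi-square with zero df should give you these fit indices.)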
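The counting can be checked by hand. The sketch below uses base R only and counts the variances as well as the covariances. The chi-square values are copied from the output in the question, and the CFI formula is the usual textbook one. TLI and RMSEA are left out because their formulas divide by the model df, so with df = 0 software has to report them by convention (as far as I know, lavaan reports 1 and 0, as in the output above).

```r
# Number of observed variables in the path model
p <- 5

# Unique elements of the covariance matrix: 5 variances + 10 covariances
n_moments <- p * (p + 1) / 2

# Free parameters, tallied from the model syntax in the question
n_regressions <- 4 + 2 + 3   # Illness has 4 predictors, Fitness 2, Stress 3
n_covariances <- 1           # Exercise ~~ Hardiness
n_variances   <- 5           # 2 exogenous variances + 3 residual variances
n_free <- n_regressions + n_covariances + n_variances

# Degrees of freedom: moments minus free parameters
df_model <- n_moments - n_free

# CFI computed from the chi-square values reported in the output
chisq_model <- 0
chisq_base  <- 16594.387
df_base     <- 10
cfi <- 1 - max(chisq_model - df_model, 0) /
  max(chisq_model - df_model, chisq_base - df_base, 0)

print(c(moments = n_moments, free = n_free, df = df_model, cfi = cfi))
```

The numerator of the CFI ratio is zero whenever the model chi-square does not exceed its df, which is always the case for a saturated model.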
What do you mean by simulate a model? If you don't want the fit to be perfect, add some constraints. Typically, one constrains one or more paths to zero (that is, leaves them out of the model).
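As a rough illustration of constraining paths to zero, here is a sketch that refits the model from the question with two direct effects on Illness fixed to zero using lavaan's `0*` pre-multiplier. Which paths to fix is purely an assumption for illustration, not a substantive recommendation, and the name `constrained_model` is mine. The sample size is the `N = 363` from the code in the question.

```r
library(lavaan)

# Correlation matrix and standard deviations copied from the question
vars <- c("Exercise", "Hardiness", "Fitness", "Stress", "Illness")
sigma <- matrix(c( 1.00, -0.03,  0.39, -0.05, -0.08,
                  -0.03,  1.00,  0.07, -0.23, -0.16,
                   0.39,  0.07,  1.00, -0.13, -0.29,
                  -0.05, -0.23, -0.13,  1.00,  0.34,
                  -0.08, -0.16, -0.29,  0.34,  1.00),
                nrow = 5, byrow = TRUE)
covmax <- cor2cov(sigma, c(66.5, 3.8, 18.4, 6.7, 624.8))
dimnames(covmax) <- list(vars, vars)

# Same paths as the question, except that the direct effects of Exercise
# and Hardiness on Illness are fixed to zero (illustrative assumption).
# With fixed.x = FALSE the exogenous variances are estimated automatically.
constrained_model <- '
  Illness ~ 0*Exercise + Fitness + 0*Hardiness + Stress
  Fitness ~ Exercise + Hardiness
  Stress  ~ Exercise + Hardiness + Fitness
  Exercise ~~ Hardiness
'

fit_constrained <- sem(constrained_model, sample.cov = covmax,
                       sample.nobs = 363, fixed.x = FALSE)

# Two free parameters were removed, so the model is no longer saturated
fitMeasures(fit_constrained, c("chisq", "df", "pvalue", "cfi", "rmsea"))
```

Dropping the two terms from the formula entirely would give the same model. Writing `0*` just makes the constraint visible in the syntax.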
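So yes, it has to do with the model specification. It's an unusual model to test with an SEM, but if that's the model you want to test, that's your model. If you want to make it more testable, you need to add a variable which is a possible cause of one variable, but not of the others. For example, social support might influence stress, but should not (directly) influence illness, and perhaps not the others. If you add social support and put an arrow from social support ONLY to stress, you add 6 elements to the covariance matrix (5 covariances and 1 variance) but only 1 new path. How many df you gain depends on what else is estimated for the new variable: if social support is also allowed to covary with Exercise and Hardiness, you add 4 parameters in total (the path, the variance of social support, and 2 covariances), which leaves 6 - 4 = 2 df. Either way the model has positive df, and the fit will no longer be perfect by construction.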