I ran the same SEM model in sem and lavaan. I got the same parameters and, generally, very close test values, with the exception of AIC and BIC, which were immensely different between the two packages.
The following are the resulting AIC and BIC from sem:
AIC = 2913.849
BIC = -1777.617
The following are the resulting AIC and BIC from lavaan:
Akaike (AIC) 37780.878
Bayesian (BIC) 38178.999
Why is there such a huge difference in these values? Are they calculated differently in each package?
edit:
Here is the code I used to get these values.
For the sem package, I used the following lines:
options(fit.indices = c("GFI", "AGFI", "RMSEA", "NFI", "NNFI", "CFI", "RNI", "IFI", "SRMR", "AIC", "AICc", "BIC", "CAIC"))
fit = sem(mymodel, cov(mydata), nrow(mydata), data = mydata)
summary(fit)
For the lavaan package, the following lines were used instead:
fit = sem(mymodel, data = mydata, estimator = "ML")
summary(fit, fit.measures = TRUE)
I am using R version 3.0.3 (2014-03-06) "Warm Puppy" (x64 version), sem package version 3.1-3 and lavaan package version 0.5-16.
Best Answer
I suspected the same thing as Stat when I wrote my comment, but I wanted to cross-check it and be sure which package does what before saying so. This is what takes place:

On the one hand, lavaan uses the classical definition $AIC = -2L(\theta) + 2k$, where $k$ is the number of free parameters and $L(\theta)$ is the value of the log-likelihood function at $\theta$. (I think this is the "proper" thing to do.)

On the other hand, sem uses the approximation $AIC \approx \chi^2 + m(m + 1) - 2DF$, where $m$ is the number of variables in the model and $DF$ is the model's degrees of freedom. This approximation omits the constant part of the log-likelihood, precisely because that part is irrelevant during model comparison. It is essentially a reformulation of the residual-sum-of-squares-based derivation of $AIC$ for Gaussian likelihoods.

Both $AIC$ values should be immediately available by writing out the formulas shown above and plugging in the corresponding values from your respective model. (Model summaries have this information.)
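
To make the comparison concrete, here is a minimal base-R sketch of the two formulas. The function names `aic_loglik` and `aic_chisq` and all of the numbers are hypothetical, chosen only for illustration; they are not taken from the question's model. The sketch also assumes a covariance-only model (no mean structure), so that the number of free parameters is $k = m(m+1)/2 - DF$, and an ML chi-square defined as twice the gap between the saturated and the fitted log-likelihood.

```r
# Classical definition (the one attributed to lavaan above).
aic_loglik <- function(loglik, k) {
  -2 * loglik + 2 * k
}

# Chi-square-based approximation (the one attributed to sem above).
aic_chisq <- function(chisq, m, df) {
  chisq + m * (m + 1) - 2 * df
}

# Hypothetical inputs, for illustration only.
m  <- 6                        # observed variables
df <- 8                        # model degrees of freedom
k  <- m * (m + 1) / 2 - df     # free parameters, assuming no mean structure

loglik_model     <- -2500      # log-likelihood of the fitted model
loglik_saturated <- -2490      # log-likelihood of the saturated model

# Assumed ML chi-square: twice the log-likelihood gap.
chisq <- 2 * (loglik_saturated - loglik_model)

aic_a <- aic_loglik(loglik_model, k)
aic_b <- aic_chisq(chisq, m, df)

# Under these assumptions the gap depends only on the saturated
# log-likelihood, i.e. on the data and not on the model being fitted.
aic_a - aic_b
-2 * loglik_saturated
```

Under these assumptions the two values differ by $-2$ times the saturated log-likelihood, which is the same for every model fitted to the same data, so both versions rank competing models identically. With real fits the match may be only approximate, for example if the two packages scale the chi-square differently. In lavaan, the inputs should be retrievable with something like `fitMeasures(fit, c("logl", "npar", "chisq", "df"))`; check the measure names against your installed version.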