Difference in AIC and BIC values between sem and lavaan packages in R

Tags: aic, bic, lavaan, r, structural-equation-modeling

I ran the same SEM model in sem and lavaan. I got the same parameters and, generally, very close test values, with the exception of AIC and BIC which were immensely different between the two packages.

The following is the resulting AIC and BIC from sem:

AIC =  2913.849
BIC =  -1777.617

The following is the resulting AIC and BIC from lavaan:

Akaike (AIC)                               37780.878
Bayesian (BIC)                             38178.999

Why is there such a huge difference in these values? Are they calculated differently in each package?

edit:

Here is the code I used to obtain these values.

For the sem package, I used the following lines:

options(fit.indices = c("GFI", "AGFI", "RMSEA", "NFI", "NNFI", "CFI", "RNI", "IFI", "SRMR", "AIC", "AICc", "BIC", "CAIC"))
fit = sem(mymodel, cov(mydata), nrow(mydata), data = mydata)
summary(fit)

For the lavaan package, I used the following lines instead:

fit = sem(mymodel, data = mydata, estimator = "ML")
summary(fit,fit.measures=TRUE)

I am using R version 3.0.3 (2014-03-06) "Warm Puppy" (x64 version), sem package version 3.1-3 and lavaan package version 0.5-16.

Best Answer

I suspected the same thing as Stat when I wrote my comment but I wanted to cross-check it before saying it and be sure which package does what. This is what takes place:

On the one hand, lavaan uses the classical definition of $AIC$, namely $AIC = -2L(\theta) + 2k$, where $k$ is the number of free parameters and $L(\theta)$ is the value of the log-likelihood function at $\theta$. (I think this is the "proper" thing to do.)

On the other hand, sem uses the approximation $AIC \approx \chi^2 + m(m + 1) - 2DF$, where $m$ is the number of observed variables in the model and $DF$ is the model's degrees of freedom. This approximation omits the constant part of the log-likelihood, because that part is irrelevant when comparing models fitted to the same data. It is essentially a reformulation of the residual-sum-of-squares-based derivation of $AIC$ for Gaussian likelihoods.
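To see how the two definitions relate, here is a short sketch of the algebra. It assumes a covariance-structure model without a mean structure, so that $k = m(m+1)/2 - DF$, and it introduces $L_{sat}$ (my notation, not the packages') for the log-likelihood of the saturated model. The $\chi^2$ statistic is the likelihood-ratio test against that saturated model:

$$
\begin{aligned}
\chi^2 &= -2\left[L(\theta) - L_{sat}\right] \quad\Rightarrow\quad -2L(\theta) = \chi^2 - 2L_{sat}, \\
AIC_{lavaan} &= -2L(\theta) + 2k = \chi^2 - 2L_{sat} + m(m+1) - 2DF, \\
AIC_{lavaan} &= AIC_{sem} - 2L_{sat}.
\end{aligned}
$$

Under these assumptions the two values differ only by $-2L_{sat}$, which depends on the data but not on the fitted model, so both rank competing models identically. The packages may also scale $\chi^2$ slightly differently (by $N$ versus $N-1$), in which case the identity holds only approximately across packages.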

Both $AIC$ values should be immediately available by simply writing out the formulas shown above and replacing each variable with the realization of it in your respective model. (Model summaries have this information.)
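As a rough illustration of writing out both formulas by hand, the sketch below fits a model in lavaan and computes the two versions of $AIC$ from the reported fit measures. The three-factor model and the `HolzingerSwineford1939` data set that ships with lavaan are stand-ins of my choosing, not the model from the question, and the $\chi^2$-based value only mimics the formula attributed to sem above (it does not call the sem package).

```r
library(lavaan)

# Stand-in model on lavaan's bundled example data (an assumption for illustration)
model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'
fit <- cfa(model, data = HolzingerSwineford1939)

# Pull the ingredients of both formulas from the fitted model
fm <- fitMeasures(fit, c("chisq", "df", "npar", "logl",
                         "unrestricted.logl", "aic"))

# Classical definition: -2 * log-likelihood + 2 * number of free parameters
aic_loglik <- -2 * fm[["logl"]] + 2 * fm[["npar"]]

# Chi-square-based version: chisq + m(m + 1) - 2 * DF, with m observed variables
m <- 9
aic_chisq <- fm[["chisq"]] + m * (m + 1) - 2 * fm[["df"]]

print(c(lavaan_reported = fm[["aic"]],
        from_loglik     = aic_loglik,
        from_chisq      = aic_chisq))

# The gap between the two versions should match -2 * saturated log-likelihood
print(c(gap           = aic_loglik - aic_chisq,
        minus2_satlog = -2 * fm[["unrestricted.logl"]]))
```

The first printout should show the hand-computed log-likelihood version agreeing with the value lavaan reports, while the $\chi^2$-based version sits on a very different scale. The second printout compares that gap with the saturated-model constant.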
