Solved – Best fitting model – AIC or CFI/TLI/RMSEA

aic, confirmatory-factor, factor analysis, structural-equation-modeling

For my thesis I am conducting a factor analysis of a Belgian personality questionnaire, using the lavaan package for R. I have applied a split-sample procedure, and use sample 1 for exploratory factor analysis (EFA), and sample 2 for confirmatory factor analysis (CFA). Both samples have N > 300. All models below have a two-factor solution.

However, when I test the model that I identified in EFA (sample 1) using CFA (sample 2), I get a poorly fitting model (low CFI and TLI, high RMSEA). On the other hand, the modification indices suggest several changes that seem to make sense (i.e., letting the error terms of similar items correlate), after which I do get a well-fitting adjusted model, with a CFI and TLI of .95 and an RMSEA of .05.

Now, my main problem is that when I compare the fit of my Belgian model with the fit of a previously identified factor structure of the English version of the questionnaire (tested on the same data), the AIC of this English model is lower than that of my adjusted model (roughly 7500 vs. 8500). This makes sense, as my new model is more complex and AIC penalizes complexity. However, the overall fit of the English model is not very high (CFI/TLI around .89, RMSEA around .07). Thus, although the English model has the lower AIC and would therefore be the preferred model, it fits the current data rather poorly.

I want to use the factors (subscales) I identified in a future study in which I relate these to behavioral measures, but I don't have time for another factor analysis to test any model of the questionnaire again. Should I continue using my own Belgian model/subscales, or is it better to use the English model/subscales instead? Also, is it commonly accepted to use subscales from a questionnaire from a different language if this better fits the data than native-language questionnaires?

I hope this somewhat makes sense – thanks for any replies!

Sjon

Best Answer

AIC can't be used to compare models fitted on different datasets, so comparing the AIC of your model with that of the English model makes no sense whatsoever. There is therefore no justification at all for abandoning your model in favour of the English one on this basis.
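To make the point about datasets concrete, here is a minimal sketch in Python (used only because the arithmetic is language-neutral; it is not lavaan output). It assumes the usual definition AIC = 2k - 2 log L, fits a toy normal model rather than a factor model, and uses made-up numbers. The helper names `aic` and `fit_normal` are hypothetical.

```python
import math

def aic(loglik, n_params):
    """Usual definition: AIC = 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * loglik

def fit_normal(data):
    """Fit a normal distribution by maximum likelihood.

    Returns the maximised log-likelihood and the number of
    estimated parameters (mean and standard deviation).
    """
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n  # ML estimate (divides by n)
    loglik = -0.5 * n * (math.log(2 * math.pi * var) + 1)
    return loglik, 2

# Made-up data, for illustration only.
sample = [0.2, -1.1, 0.7, 1.5, -0.4, 0.9]
doubled = sample * 2  # a different dataset: the same values, twice over

ll_small, k = fit_normal(sample)
ll_big, _ = fit_normal(doubled)

# Same model, same estimates, yet a very different AIC, because the
# log-likelihood is a sum over the observations in the data.
print("AIC on sample: ", aic(ll_small, k))
print("AIC on doubled:", aic(ll_big, k))
```

The log-likelihood is summed over the observations (and depends on which variables are modelled), so an AIC difference is only interpretable when both models are fitted to exactly the same data.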

I would also advise a little caution in the use of modification indices, which allow the analyst to obtain a better fitting model without thinking. However, it does appear from what you said that these residual error covariances are justified in your case, since you say that they are similar items.
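A modification index approximates how much the model chi-square is expected to drop if that parameter is freed, which is why adding residual covariances pushes CFI, TLI and RMSEA upwards almost mechanically. The sketch below (Python, for the arithmetic only) uses the textbook formulas for these indices; all chi-square values, degrees of freedom and the sample size are hypothetical, and software may differ in details such as dividing by N rather than N - 1 in RMSEA.

```python
import math

def rmsea(chi2, df, n):
    """RMSEA with the N - 1 convention (some software uses N instead)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2, df, chi2_base, df_base):
    """Comparative fit index relative to the baseline (independence) model."""
    d_model = max(chi2 - df, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base

def tli(chi2, df, chi2_base, df_base):
    """Tucker-Lewis index, based on chi-square / df ratios."""
    ratio_base = chi2_base / df_base
    return (ratio_base - chi2 / df) / (ratio_base - 1.0)

# Hypothetical values, not taken from the question.
n = 301
chi2_base, df_base = 500.0, 45   # baseline model
before = (85.0, 34)              # original CFA
after = (40.0, 31)               # three residual covariances freed

for label, (chi2, df) in (("before", before), ("after", after)):
    print(label,
          "CFI=%.3f" % cfi(chi2, df, chi2_base, df_base),
          "TLI=%.3f" % tli(chi2, df, chi2_base, df_base),
          "RMSEA=%.3f" % rmsea(chi2, df, n))
```

Because every freed parameter can only lower the chi-square, better indices after such changes are expected; the substantive justification (here, similar item wording) is what makes them defensible.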
