Is a multigroup confirmatory factor analysis appropriate for comparing the measurement model across two groups?

confirmatory-factor, factor analysis, structural-equation-modeling

Background: Before commencing treatment, 180 participants completed the baseline 17-item Hamilton Depression Rating Scale (HDRS), which uses Likert-type items. As a result of the treatment, half developed depression and the other half did not, so I have two groups: depressed and non-depressed.

Analysis: I want to know whether the factor structure of the baseline HDRS differs between the two groups. Thus, I performed a multiple group confirmatory factor analysis (MGCFA).
I used a theoretically validated and relevant HDRS factor model as the basis of the MGCFA, comparing goodness-of-fit measures to determine measurement invariance between the groups. However, my initial CFA indicated poor model fit in both groups. I am mindful that my sample sizes may be too small for this.

Questions

  • Is multigroup CFA appropriate for comparing the measurement model of my two groups?
  • What should I do given the poor model fit of the CFA in both groups?
  • Is a total sample size of 180 too small for this multiple group CFA?

Best Answer

In general, multiple group CFA is a good tool for assessing the equivalence of a measurement model across two groups. However, you first need to show that the measurement model makes sense in at least one group.

First, analyse the entire sample

There are many different ways of tackling factor exploration. Here are some thoughts:

  • Consider analysing the factor structure of the entire sample of 180 (a sketch of such a single-sample CFA follows this list).
  • If the CFA is giving poor fit, you might want to run an exploratory factor analysis to see what is going on in the data (see the EFA sketch after this list). Are there items loading on factors other than those theorised? Does the number of factors theorised seem reasonable relative to the factor structures you get with a few more or a few fewer factors?
  • Also, it is not uncommon for item-level CFAs with large numbers of items per factor to yield relatively poor fit. This can be explained both in terms of how we think about fit statistics in such contexts and in terms of the simplicity of many measurement models. For example, it is quite common for there to be a number of systematic deviations from the idealised structure (e.g., items with common words correlating more highly; items close together correlating more highly; items that are both negatively worded correlating more highly; etc.). Without incorporating these, dare I say, nuisance characteristics, fit will often be poor even when the theorised structure is a reasonable approximation.
  • Some people adopt an item-parcelling approach to CFA that often smooths out some of these low-level item-characteristic effects. Note that there is some controversy over the appropriateness of this approach. Item parcelling is often appealing to researchers, perhaps wrongly, because it can improve fit statistics enough to satisfy conventional criteria and therefore increase the chance of getting published. A more reasonable justification for item parcelling is when you are not interested in the item level and only in more general characteristics of the scale. Anyway, in your case, you might want to examine item parcelling, as it also reduces the number of parameters that need to be estimated (see the parcelling sketch after this list).
  • With regard to sample size, I agree that more would be better, but my sense is that you'll still be able to get interesting results with 180. You can use fit statistics with confidence intervals to quantify the uncertainty associated with the smaller sample size.
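For the single-sample CFA, a minimal sketch in Python with the semopy package (which uses lavaan-style model syntax) might look like the following. The file name, the item names (ham1 ... ham17), and the particular three-factor split are placeholders, not the validated HDRS model itself.

    import pandas as pd
    import semopy

    # Placeholder three-factor structure -- substitute the validated HDRS
    # factor model you are actually working from.
    model_desc = """
    core    =~ ham1 + ham2 + ham3 + ham7 + ham8
    anxiety =~ ham9 + ham10 + ham11 + ham15
    somatic =~ ham4 + ham5 + ham6 + ham12 + ham13 + ham14 + ham16 + ham17
    """

    data = pd.read_csv("hdrs_baseline.csv")   # hypothetical file; one row per participant

    model = semopy.Model(model_desc)
    model.fit(data)                           # note: treats Likert items as continuous,
                                              # which is itself an approximation

    print(semopy.calc_stats(model).T)         # global fit statistics (chi-square, CFI, RMSEA, ...)
    print(model.inspect())                    # loadings and other parameter estimates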
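For the EFA suggestion, a sketch along these lines with the factor_analyzer package (again with placeholder file and item names) lets you inspect the eigenvalues and see whether items load where the theory says they should:

    import pandas as pd
    from factor_analyzer import FactorAnalyzer

    data = pd.read_csv("hdrs_baseline.csv")              # hypothetical file name
    items = data[[f"ham{i}" for i in range(1, 18)]]      # the 17 HDRS items

    # Eigenvalues of the item correlation matrix give a rough sense of dimensionality.
    efa_full = FactorAnalyzer(rotation=None)
    efa_full.fit(items)
    print("Eigenvalues:", efa_full.get_eigenvalues()[0].round(2))

    # Compare a few candidate solutions and see where the items actually land.
    for n_factors in (2, 3, 4):
        efa = FactorAnalyzer(n_factors=n_factors, rotation="oblimin")
        efa.fit(items)
        loadings = pd.DataFrame(
            efa.loadings_,
            index=items.columns,
            columns=[f"F{k + 1}" for k in range(n_factors)],
        )
        print(f"\n{n_factors}-factor solution (oblimin rotation)")
        print(loadings.round(2))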
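If you do look into parcelling, the mechanical part is straightforward. The parcel assignment below is entirely hypothetical and only illustrates the bookkeeping; the parcelled variables would then go into the same CFA workflow as above:

    import pandas as pd

    data = pd.read_csv("hdrs_baseline.csv")   # hypothetical file name

    # Hypothetical parcel assignment: small sets of items from the same
    # theorised factor are averaged into parcels.
    parcels = {
        "core_p1": ["ham1", "ham2"],
        "core_p2": ["ham3", "ham7", "ham8"],
        "anx_p1":  ["ham9", "ham10"],
        "anx_p2":  ["ham11", "ham15"],
        "som_p1":  ["ham4", "ham5", "ham6"],
        "som_p2":  ["ham12", "ham13", "ham14"],
        "som_p3":  ["ham16", "ham17"],
    }

    parcelled = pd.DataFrame(
        {name: data[item_list].mean(axis=1) for name, item_list in parcels.items()}
    )

    # Each factor is now indicated by 2-3 parcels rather than 4-8 items, which
    # substantially reduces the number of loadings and uniquenesses to estimate.
    print(parcelled.describe())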

Multiple group CFA

So, if, and only if, you can get a good model at the overall level would I proceed to multiple group CFA.
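A simple first check at that point is to fit the accepted model separately in each group. The sketch below again assumes semopy, the placeholder model from earlier, and a hypothetical "group" column coding depression status; a full invariance test (configural, then metric, then scalar) would additionally require estimating both groups jointly with equality constraints on loadings and intercepts and comparing the nested models.

    import pandas as pd
    import semopy

    # Same placeholder model description as in the earlier single-sample sketch.
    model_desc = """
    core    =~ ham1 + ham2 + ham3 + ham7 + ham8
    anxiety =~ ham9 + ham10 + ham11 + ham15
    somatic =~ ham4 + ham5 + ham6 + ham12 + ham13 + ham14 + ham16 + ham17
    """

    data = pd.read_csv("hdrs_baseline.csv")   # hypothetical; 'group' codes depression status

    # Configural-style check: the same model, estimated freely in each group.
    for label, subset in data.groupby("group"):
        m = semopy.Model(model_desc)
        m.fit(subset.drop(columns="group"))
        print(f"\n{label} (n = {len(subset)})")
        print(semopy.calc_stats(m).T)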
