I'm running a CFA in AMOS on an attitude scale, and I got a good model fit after deleting three problematic items (factor loadings below .30). However, my data are not normally distributed. I was reading other threads saying that Likert-based data are ordinal and hence can't be normal, but I wonder whether what I did is wrong. Should I re-run my analysis after transforming my data? Many thanks for your help!
Solved – Normality for confirmatory factor analysis with Likert scale data
confirmatory-factor, likert, normality-assumption
Related Solutions
Yes, it is possible to supply each item with its own weight. This weight, however, cannot be the loading itself because - you might remember - a loading is a regression coefficient of a factor in predicting an item, not vice versa. The weight you have in mind must be a regression coefficient of an item in predicting a factor. We obtain those weights when we compute factor scores; the weights $\mathbf{B}$ are typically estimated from the inter-item correlation (or covariance) matrix $\mathbf{R}$ and the loadings matrix $\mathbf{A}$ this way: $\mathbf{B}= \mathbf{R}^{-1} \mathbf{A}$. (If the factors were obliquely rotated, the factor structure matrix should replace $\mathbf{A}$ in this formula.) See also the discussion of coarse versus refined methods of computing factor scores; the coarse method permits using loadings as weights.
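As a minimal sketch of that formula in R - assuming a hypothetical numeric data frame `dat` of item responses and using base R's `factanal` just to obtain a one-factor loading matrix - the refined (regression-method) weights and scores could be computed like this:

```r
# Refined (regression-method) factor score weights: B = R^-1 A
# `dat` is a hypothetical data frame of numeric item responses.
fa <- factanal(dat, factors = 1, rotation = "none")

A <- unclass(fa$loadings)   # loading matrix (items x factors)
R <- cor(dat)               # inter-item correlation matrix
B <- solve(R) %*% A         # refined weights

scores <- scale(dat) %*% B  # weighted (refined) factor scores
coarse <- rowSums(dat)      # coarse alternative: unit-weighted sum score
```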
If so, then why do researchers generally use Likert scales with equally weighted items? In other words, why do they often prefer simple binary weights of 1 or 0 in place of the fractional weights computed above? There may be several reasons; to mention just three. First, the weights $\mathbf{B}$ above are not precise (unless we used the PCA model rather than the factor analysis model per se), because the uniqueness of an item is not known at the level of each case (respondent), so the computed factor scores are only an approximation of the true factor values. Second, the computed weights $\mathbf{B}$ will usually vary from sample to sample and in the end often perform not much better than simple 1-vs-0 weights. Third, the weighted-sum model behind a summative (Likert) construct is a simplification in principle. It implies that the trait measured by the scale depends on all of its items simultaneously, whatever its strength. But we know that many traits behave differently. For instance, when a trait is weak, it may show only a subset of symptoms (i.e. items), but those are expressed in full; as the trait grows stronger, more symptoms join in, some partly expressed, some expressed in full and even replacing the "older" symptoms. This dynamic and unpredictable internal growth of a trait cannot be modeled by a weighted linear combination of its phenomena. In this situation, using fine fractional weights is in no way better than using binary 0-1 weights.
I assume that you are thinking of a simple structure in which each of the 20 items loads on exactly one factor. Suppose items 1-10 load on factor 1, and items 11-20 load on factor 2. Then you could average items 1-10 and items 11-20 for each individual and calculate the correlation between those averages. Alternatively, you can estimate factor scores for the factors and obtain an estimate of the correlation that way. If you do a CFA, allowing the correlation between the factors to be free, the software will estimate that parameter for you (see the sketch at the end of this answer). Is this your question?
So yes ... these two statistics will be different. You can think of each item as being a noisy estimator of factor 1 or factor 2 (as appropriate). Taking the average will reduce the noise, but you still have noisy observations. Adding noise to a pair of variables reduces their correlation, so the first statistic will be biased downwards as an estimate of the correlation you seek. This is true even if the factor loadings are the same.
In confirmatory factor analysis, you estimate the various components of the model (uniqueness variances, loadings, factor covariances) through maximum likelihood (or some other method), so you end up actually estimating the parameter of interest (the factor correlation).
As a bonus, you can still get the covariance of the factors in a more complex model where items load on more than one factor. Sum scores would not work at all in that case, but the covariance of the factors will still emerge from the optimization.
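To make the contrast concrete, here is a hedged sketch in R using lavaan, assuming a hypothetical data frame `dat` with items named x1-x20; it compares the sum-score correlation with the CFA-estimated factor correlation:

```r
library(lavaan)

# Approach 1: correlate the two mean scores (attenuated by item noise)
s1 <- rowMeans(dat[, paste0("x", 1:10)])
s2 <- rowMeans(dat[, paste0("x", 11:20)])
cor(s1, s2)

# Approach 2: two-factor CFA; the factor correlation is a free parameter
model <- '
  F1 =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10
  F2 =~ x11 + x12 + x13 + x14 + x15 + x16 + x17 + x18 + x19 + x20
'
fit <- cfa(model, data = dat, std.lv = TRUE)
lavInspect(fit, "cor.lv")   # estimated correlation between F1 and F2
```

The correlation from Approach 1 will typically be smaller than the one from Approach 2, because item noise attenuates the observed correlation while the CFA estimates the correlation between the latent factors directly.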
Best Answer
Treating ordinal data as continuous is often a reasonable approximation; there are a few papers on this:
http://psycnet.apa.org/journals/met/9/4/466/
http://rd.springer.com/article/10.1007/s11135-008-9190-y
http://www.unc.edu/~curran/pdfs/Curran,West%26Finch(1996).pdf
Go read some papers that use CFA and you'll see that treating ordinal data as continuous is common (you'll even find some by me :). One argument for this is that it matches how the instrument is used: people sum the scores and treat those sums as a continuous scale, so you can translate between the meaning of your CFA and the meaning of the total score as it is actually used.
However, if you want to address the issue, transforming the data won't help (well, you can transform it, but you can't magically make it non-ordinal). What you can do is analyze it more appropriately.
There are two issues. The first is the parameter estimates, and particularly their standard errors; you can address this with bootstrapping, which is straightforward in Amos. The second is the fit indices, which is not so easily solved.
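Amos does the bootstrapping through its GUI. For readers working in R, a rough lavaan equivalent might look like the following sketch (the data frame `dat` and item names x1-x5 are hypothetical):

```r
library(lavaan)

# One-factor CFA with bootstrapped standard errors
model <- 'F =~ x1 + x2 + x3 + x4 + x5'
fit <- cfa(model, data = dat, se = "bootstrap", bootstrap = 1000)
parameterEstimates(fit)   # SEs taken from the bootstrap distribution
```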
It's possible using a Bayesian approach in Amos (I believe), or you can also treat the data as ordinal in paid programs like Mplus, LISREL, EQS, or gsem (in Stata), or in free programs like lavaan (in R), sem (in R), and Mx (stand-alone, and as an R package).
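In lavaan, for example, treating the items as ordinal only requires declaring them with the `ordered` argument; lavaan then uses polychoric correlations and a robust weighted least squares (WLSMV) estimator by default. A minimal sketch, again with a hypothetical `dat` and item names:

```r
library(lavaan)

items <- paste0("x", 1:5)                    # hypothetical item names
model <- 'F =~ x1 + x2 + x3 + x4 + x5'
fit <- cfa(model, data = dat, ordered = items)
summary(fit, fit.measures = TRUE)            # loadings, thresholds, fit indices
```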
Amos isn't up to the task. It used to be the case, many years ago, that Amos was the most cutting-edge program, but it has barely advanced in the past 10-15 years, while other programs have overtaken it. I'm not sure why that is.