Yes, the two errors amount to the same thing. They are telling you (roughly) that two or more of your manifest variables are linearly dependent (e.g., $y_1 = a y_2 + b$ for some scalars $a, b$). Such variables (dimensions) are "redundant", which means the sample covariance matrix is not invertible (i.e., it is singular) and therefore not positive definite either.
As for what you ought to do about it, that depends. First I would try to find out which variables are giving you the trouble; a scatterplot matrix might be enough to tell you that. Then you can decide what to do from there, most likely dropping some of the redundant variables.
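For a quick diagnostic in R, you can check the rank and eigenvalues directly. A minimal sketch, assuming your numeric data sit in a data frame or matrix X (the object name is just illustrative):

    S <- cov(X)                          # sample covariance matrix
    eigen(S, only.values = TRUE)$values  # near-zero eigenvalues => (near-)singular, not positive definite
    qr(as.matrix(X))$rank                # rank < ncol(X) => some columns are linearly dependent
    pairs(X)                             # scatterplot matrix to spot the redundant variables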
I think the best method, in your case, is to factor-analyze the polychoric correlation matrix. In R, the 'psych' package allows you to perform the polychoric factor analysis (via the fa.poly function) and also to compute the factor scores; the package documentation may be useful here. Moreover, the 'psych' package contains the fa.parallel.poly function, which is very useful for choosing the optimal number of factors to retain by Monte Carlo simulation (parallel analysis), as sketched below.
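A rough sketch of that workflow, assuming a hypothetical data frame items of ordinal (Likert-type) responses; check the arguments against your version of 'psych':

    library(psych)
    fa.parallel.poly(items)              # parallel analysis on polychoric correlations
    fit <- fa.poly(items, nfactors = 3)  # polychoric EFA; choose nfactors from the parallel analysis
    fit$scores                           # factor scores (see ?fa.poly)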
With the missing values, you can either exclude the incomplete cases from the analysis or replace them with the mean or the median values.
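For example, with the same hypothetical items data (fa.poly exposes missing/impute arguments, but verify them in ?fa.poly):

    fit <- fa.poly(items, nfactors = 3, missing = TRUE, impute = "median")  # impute medians
    # or drop the incomplete cases instead:
    fit <- fa.poly(na.omit(items), nfactors = 3)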
Here is a recent paper that supports the superiority of polychoric factor analysis for ordinal variables:
Holgado-Tello, F. C., Chacón-Moscoso, S., Barbero-García, I., & Vila-Abad, E. (2010). Polychoric versus Pearson correlations in exploratory and confirmatory factor analysis of ordinal variables. Quality & Quantity, 44(1), 153-166.
In response to your second question, principal component analysis and factor analysis are not the same thing. If your aim is simply to reduce your data, then principal component analysis is the technique of choice. Otherwise, if you want to explore the underlying dimensions of your questionnaire, you have to use factor analysis. In PCA, the components are derived from the variables (by maximizing the variance), while in FA it is the factors that explain the variables, so the direction of the relationship is reversed. To my knowledge, this is the only important aspect to keep in mind when you have to choose between the two methods.
fa.poly conducts an FA, and you can specify the factoring method (GLS, WLS, PF, ...). If you want to conduct a PCA, I think you can use principal, submitting to the analysis not the raw data but the polychoric correlation matrix. Check the 'psych' documentation for these aspects; I have never done a categorical principal component analysis myself.
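A minimal sketch of that idea, again with the hypothetical items data (check the 'psych' documentation before relying on it):

    pc  <- polychoric(items)  # polychoric correlations, returned in pc$rho
    pca <- principal(pc$rho, nfactors = 3, n.obs = nrow(items))  # PCA on the correlation matrix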
Best Answer
The best treatment of this question that I have seen is a 1979 book chapter by Karl Jöreskog, "Basic Ideas of Factor and Component Analysis." Sadly, I can't locate a PDF online; it is a classic for readability and succinctness.
Maximum likelihood is just an estimation method. The real distinction is between principal components analysis (PCA) and common factor analysis (FA). PCA aims to turn $p$ observed variables into $p$ or fewer weighted composites, choosing each additional composite so as to explain the greatest share of variance not explained by the previous composites. Covariance is explained almost by coincidence: the focus is on the $p$ variances.
By contrast, FA accounts for the covariance among a set of $p$ observed variables using $k < p$ common factors plus $p$ unique factors (or "error terms"). The $p$ unique factors fit the diagonal elements of the observed covariance matrix trivially, leaving zero residuals there. The common factors are chosen so as to best account for the covariance among the observed variables, thus minimizing the residuals for the off-diagonal elements of the covariance matrix (which outnumber the diagonal elements for $p > 3$). Combining these two features, it should not be surprising that, for the same number of common factors or principal components, the average squared residual across the entire covariance matrix will be smaller for an FA model than for a PCA solution, assuming the covariance matrix is consistent with a common factor model.
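In symbols (standard notation, not taken from the chapter): the factor model writes the covariance matrix as $\Sigma = \Lambda \Lambda^{\top} + \Psi$, where $\Lambda$ is the $p \times k$ matrix of common-factor loadings and $\Psi$ is the diagonal matrix of unique variances. Because $\Psi$ is free on the diagonal, the diagonal of $\Sigma$ is reproduced exactly and all of the misfit is pushed off-diagonal. A $k$-component PCA instead approximates $\Sigma \approx W W^{\top}$ with $W = V_k D_k^{1/2}$ taken from the eigendecomposition $\Sigma = V D V^{\top}$; there is no separate diagonal term, so residuals land on and off the diagonal alike.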
It is the fundamental difference in technique, not the difference in estimation method, that is primarily responsible for this difference in performance. Maximum likelihood and generalized least squares estimation of a factor model, for example, are asymptotically equivalent under the usual assumptions.
Jöreskog, K. G. (1979). Basic ideas of factor and component analysis. In Advances in Factor Analysis and Structural Equation Models (pp. 5-20). Abt Books.