I think @Jeromy has already said the essentials, so I shall concentrate on measures of reliability.
Cronbach's alpha is a sample-dependent index used to estimate a lower bound on the reliability of an instrument. It is no more than an indicator of the variance shared by all items entering the computation of a scale score. It should therefore not be confused with an absolute measure of reliability, nor does it apply to a multidimensional instrument as a whole. In effect, three assumptions are made: (a) no residual correlations, (b) identical item loadings, and (c) unidimensionality of the scale. This means that the only case where alpha is essentially the same as reliability is a unidimensional instrument with uniformly high factor loadings and no error covariances (1). Because its precision depends on the standard error of the item intercorrelations, alpha reflects the spread of item correlations regardless of the source or sources of that spread (e.g., measurement error or multidimensionality). This point is discussed at length in (2). It is worth noting that when alpha is 0.70, a widely cited reliability threshold for group-comparison purposes (3,4), the standard error of measurement will be over half (0.55) a standard deviation.
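To make the alpha/SEM relationship above concrete, here is a minimal Python sketch (the function names are my own, not from any package) computing alpha from a subjects-by-items score matrix and the standard error of measurement, $SD\sqrt{1-\alpha}$:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_subjects, k_items) score matrix."""
    X = np.asarray(items, dtype=float)
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1)          # variance of each item
    total_var = X.sum(axis=1).var(ddof=1)      # variance of the total score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def sem(sd_total, alpha):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd_total * np.sqrt(1 - alpha)

# With alpha = 0.70, the SEM is sqrt(0.30), over half a standard deviation
print(round(sem(1.0, 0.70), 2))  # 0.55
```

This makes the 0.70 threshold easy to interrogate for any reliability value you observe.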
Moreover, Cronbach's alpha is a measure of internal consistency; it is not a measure of unidimensionality and cannot be used to infer unidimensionality (5). Finally, we can quote L.J. Cronbach himself:
> Coefficients are a crude device that does not bring to the surface many subtleties implied by variance components. In particular, the interpretations being made in current assessments are best evaluated through use of a standard error of measurement. --- Cronbach & Shavelson (6)
There are many other pitfalls, which have been discussed at length in several papers over the last 10 years (e.g., 7-10).
Guttman (1945) proposed a series of six so-called lambda indices that assess a similar lower bound for reliability, and Guttman's $\lambda_3$ lower bound is strictly equivalent to Cronbach's alpha. If, instead of estimating the true variance of each item as the average covariance between items, we consider the amount of variance in each item that can be accounted for by the linear regression on all other items (i.e., the squared multiple correlation), we get the $\lambda_6$ estimate, which can be computed for multi-scale instruments as well. More details can be found in William Revelle's forthcoming textbook, An introduction to psychometric theory with applications in R (chapter 7). (He is also the author of the psych R package.) You might be interested in reading sections 7.2.5 and 7.3 in particular, as they give an overview of alternative measures, like McDonald's $\omega_t$ or $\omega_h$ (instead of using the squared multiple correlation, we use item uniquenesses as determined from an FA model) or Revelle's $\beta$ (replace FA with hierarchical cluster analysis; for a more general discussion, see (12,13)), and provide simulation-based comparisons of all these indices.
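For the curious, $\lambda_6$ is easy to sketch in Python on standardized items. This follows the usual formulation (squared multiple correlations obtained from the diagonal of the inverse correlation matrix); the function name is my own, and this is only a sketch, not the psych package's implementation:

```python
import numpy as np

def lambda6(items):
    """Guttman's lambda-6: error variance of each item is estimated via
    its squared multiple correlation (SMC) with all other items."""
    R = np.corrcoef(np.asarray(items, dtype=float), rowvar=False)
    # SMC of item i = 1 - 1 / (R^-1)_ii
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    v_total = R.sum()  # variance of the sum of the standardized items
    return 1.0 - (1.0 - smc).sum() / v_total
```

Because each item's SMC uses all other items as predictors, this estimate remains meaningful even when the items span more than one scale.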
References
- Raykov, T. (1997). Scale reliability, Cronbach’s coefficient alpha, and violations of essential tau-equivalence for fixed congeneric components. Multivariate Behavioral Research, 32, 329-354.
- Cortina, J.M. (1993). What Is Coefficient Alpha? An Examination of Theory and Applications. Journal of Applied Psychology, 78(1), 98-104.
- Nunnally, J.C. and Bernstein, I.H. (1994). Psychometric Theory. McGraw-Hill Series in Psychology, Third edition.
- De Vaus, D. (2002). Analyzing social science data. London: Sage Publications.
- Danes, J.E. and Mann, O.K. (1984). Unidimensional measurement and structural equation models with latent variables. Journal of Business Research, 12, 337-352.
- Cronbach, L.J. and Shavelson, R.J. (2004). My current thoughts on coefficient alpha and successor procedures. Educational and Psychological Measurement, 64(3), 391-418.
- Schmitt, N. (1996). Uses and Abuses of Coefficient Alpha. Psychological Assessment, 8(4), 350-353.
- Iacobucci, D. and Duhachek, A. (2003). Advancing Alpha: Measuring Reliability With Confidence. Journal of Consumer Psychology, 13(4), 478-487.
- Shevlin, M., Miles, J.N.V., Davies, M.N.O., and Walker, S. (2000). Coefficient alpha: a useful indicator of reliability? Personality and Individual Differences, 28, 229-237.
- Fong, D.Y.T., Ho, S.Y., and Lam, T.H. (2010). Evaluation of internal reliability in the presence of inconsistent responses. Health and Quality of Life Outcomes, 8, 27.
- Guttman, L. (1945). A basis for analyzing test-retest reliability. Psychometrika, 10(4), 255-282.
- Zinbarg, R.E., Revelle, W., Yovel, I., and Li, W. (2005). Cronbach's $\alpha$, Revelle's $\beta$, and McDonald's $\omega_h$: Their relations with each other and two alternative conceptualizations of reliability. Psychometrika, 70(1), 123-133.
- Revelle, W. and Zinbarg, R.E. (2009). Coefficients alpha, beta, omega and the glb: comments on Sijtsma. Psychometrika, 74(1), 145-154.
I don't have any citations, but here's what I'd suggest:
Zeroth: If at all possible, split the data into a training and test set.
First, do EFA. Look at various solutions to see which ones make sense, based on your knowledge of the questions. You have to do this before Cronbach's alpha, or you won't know which items go into which factor. (Running alpha on ALL the items is probably not a good idea.)
Next, run alpha and delete items that have much poorer correlations than the others in each factor. I wouldn't set an arbitrary cutoff, I'd look for ones that were much lower than the others. See if deleting those makes sense.
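One concrete way to spot items with "much poorer correlations" is to look at each item's corrected item-total correlation (the item against the sum of the remaining items) and at alpha recomputed with the item deleted. Here is a hedged sketch; the function name and return format are my own:

```python
import numpy as np

def item_diagnostics(items):
    """For each item: (corrected item-total correlation, alpha if deleted)."""
    X = np.asarray(items, dtype=float)
    n, k = X.shape
    total = X.sum(axis=1)
    out = []
    for j in range(k):
        rest_score = total - X[:, j]                  # total excluding item j
        r_it = np.corrcoef(X[:, j], rest_score)[0, 1]
        rest = np.delete(X, j, axis=1)                # alpha without item j
        kk = k - 1
        a_del = kk / (kk - 1) * (
            1 - rest.var(axis=0, ddof=1).sum() / rest.sum(axis=1).var(ddof=1)
        )
        out.append((r_it, a_del))
    return out
```

An item whose corrected item-total correlation sits far below the others, and whose deletion raises alpha, is a natural candidate for the scrutiny described above — though the final call should still rest on whether deleting it makes substantive sense.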
Finally, choose items with a variety of "difficulty" levels from IRT.
Then, if possible, redo this on the test set, but without doing any exploring. That is, see how well the result found on the training set works on the test set.
EFA versus PCA
In a previous question on the differences between EFA and PCA, I stated:
> I find that typically within the context of developing psychological scales factor analysis is more theoretically appropriate. Latent factors are often assumed to cause the observed variables.
Assessing Scale Dimensionality
Determining the dimensionality underlying a set of Likert items is not just a question of EFA versus PCA; there are multiple techniques. William Revelle has software in R implementing several of them (see this discussion).
In general there is rarely a definitive answer as to how many factors are required to model a set of items. If you extract more factors, you can explain more variance in the items. Of course, just by chance you might explain some variance, so some approaches try to rule out chance (e.g., the parallel test). However, even with very large samples, where chance becomes less of an explanation, I'd expect to see systematic but small increases in variance explained by extracting more factors. Thus, you are left with the issue of how much variance must be explained by the first factor relative to others in order to conclude that the scale is sufficiently unidimensional for your purpose. Such issues are closely tied to application and broader issues of validity.
You might find the following article useful to read, for a broader discussion of definitions and approaches at quantifying unidimensionality:
Hattie, J. (1985). Methodology review: Assessing unidimensionality of tests and items. Applied Psychological Measurement, 9(2), 139.
Here's a web presentation examining a few different decision rules for defining unidimensionality.