I do not have an original source, but it appears this topic is actually quite debated. Some argue that you can do parallel analysis from PCA eigenvalues when doing PAF/maximum likelihood EFA, while others suggest this is inappropriate.
B. P. O'Connor wrote the following in his macro for parallel analysis for PCA/PAF (people.ok.ubc.ca/brioconn/nfactors/nfactors.html):
> Principal components eigenvalues are often used to determine the number of common factors. This is the default in most statistical software packages, and it is the primary practice in the literature. It is also the method used by many factor analysis experts, including Cattell, who often examined principal components eigenvalues in his scree plots to determine the number of common factors. But others believe this common practice is wrong. Principal components eigenvalues are based on all of the variance in correlation matrices, including both the variance that is shared among variables and the variances that are unique to the variables. In contrast, principal axis eigenvalues are based solely on the shared variance among the variables. The two procedures are qualitatively different. Some therefore claim that the eigenvalues from one extraction method should not be used to determine the number of factors for the other extraction method. The issue remains neglected and unsettled.
There are two equivalent ways to express the parallel analysis criterion. But first I need to take care of a misunderstanding prevalent in the literature.
The Misunderstanding
Under the so-called Kaiser rule (Kaiser didn't actually like the rule, if you read his 1960 paper), eigenvalues greater than one are retained for principal component analysis. Under the corresponding rule, eigenvalues greater than zero are retained for principal factor analysis/common factor analysis. This confusion has arisen over the years because several authors have been sloppy about using the label "factor analysis" to describe "principal component analysis," when they are not the same thing.
See *Gently Clarifying the Application of Horn’s Parallel Analysis to Principal Component Analysis Versus Factor Analysis* for the math of it if you need convincing on this point.
Parallel Analysis Retention Criteria
For principal component analysis based on the correlation matrix of $p$ number of variables, you have several quantities. First you have the observed eigenvalues from an eigendecomposition of the correlation matrix of your data, $\lambda_{1}, \dots, \lambda_{p}$. Second, you have the mean eigenvalues from eigendecompositions of the correlation matrices of "a large number" of random (uncorrelated) data sets of the same $n$ and $p$ as your own, $\bar{\lambda}^{\text{r}}_{1},\dots,\bar{\lambda}^{\text{r}}_{p}$.
Horn also frames his examples in terms of "sampling bias" and estimates this bias for the $q^{\text{th}}$ eigenvalue (for principal component analysis) as $\varepsilon_{q} = \bar{\lambda}^{\text{r}}_{q} - 1$. This bias can then be used to adjust observed eigenvalues thus: $\lambda^{\text{adj}}_{q} = \lambda_{q} - \varepsilon_{q}$
Given these quantities you can express the retention criterion for the $q^{\text{th}}$ observed eigenvalue of a principal component parallel analysis in two mathematically equivalent ways:
$\lambda^{\text{adj}}_{q} \left\{\begin{array}{ll} > 1 & \text{Retain.} \\ \le 1 & \text{Do not retain.} \end{array}\right.$

$\lambda_{q} \left\{\begin{array}{ll} > \bar{\lambda}^{\text{r}}_{q} & \text{Retain.} \\ \le \bar{\lambda}^{\text{r}}_{q} & \text{Do not retain.} \end{array}\right.$
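To make the computation concrete, here is a minimal Python/numpy sketch of the principal component criterion (the function name, the 1000-iteration default, and the use of standard normal deviates for the random data sets are illustrative assumptions, not part of Horn's specification):

```python
import numpy as np

def pca_parallel_analysis(data, n_iterations=1000, seed=0):
    """Minimal sketch of Horn's parallel analysis for PCA."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # Observed eigenvalues of the correlation matrix, largest first.
    obs = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    # Mean eigenvalues over many uncorrelated data sets of the same n and p.
    mean_rand = np.zeros(p)
    for _ in range(n_iterations):
        x = rng.standard_normal((n, p))
        mean_rand += np.linalg.eigvalsh(np.corrcoef(x, rowvar=False))[::-1]
    mean_rand /= n_iterations
    bias = mean_rand - 1       # Horn's sampling bias, epsilon_q
    adjusted = obs - bias      # lambda_adj_q = lambda_q - epsilon_q
    # The two criteria coincide: (adjusted > 1) iff (obs > mean_rand).
    return adjusted > 1
```

Sorting each replicate's eigenvalues in descending order before averaging is what makes $\bar{\lambda}^{\text{r}}_{q}$ the mean of the $q^{\text{th}}$-largest random eigenvalues.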
What about for principal factor analysis/common factor analysis? Here we have to bear in mind that the bias is the corresponding mean eigenvalue: $\varepsilon_{q} = \bar{\lambda}^{\text{r}}_{q} - 0 = \bar{\lambda}^{\text{r}}_{q}$ (minus zero because the Kaiser rule for eigendecomposition of the correlation matrix with the diagonal replaced by the communalities is to retain eigenvalues greater than zero). Therefore here $\lambda^{\text{adj}}_{q} = \lambda_{q} - \bar{\lambda}^{\text{r}}_{q}$.
Therefore the retention criteria for principal factor analysis/common factor analysis ought to be expressed as:
$\lambda^{\text{adj}}_{q} \left\{\begin{array}{ll} > 0 & \text{Retain.} \\ \le 0 & \text{Do not retain.} \end{array}\right.$

$\lambda_{q} \left\{\begin{array}{ll} > \bar{\lambda}^{\text{r}}_{q} & \text{Retain.} \\ \le \bar{\lambda}^{\text{r}}_{q} & \text{Do not retain.} \end{array}\right.$
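The corresponding sketch for the common factor case replaces the diagonal of each correlation matrix with communalities before the eigendecomposition; estimating them as squared multiple correlations via pseudo-inverses is one common choice, and an assumption of this sketch:

```python
import numpy as np

def smc_reduced(R):
    # Replace diag(R) with communalities estimated as squared multiple
    # correlations (one common choice): C = R - diag(R^+)^+.
    return R - np.linalg.pinv(np.diag(np.diag(np.linalg.pinv(R))))

def fa_parallel_analysis(data, n_iterations=1000, seed=0):
    """Minimal sketch of parallel analysis for common factor analysis."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    obs = np.linalg.eigvalsh(smc_reduced(np.corrcoef(data, rowvar=False)))[::-1]
    mean_rand = np.zeros(p)
    for _ in range(n_iterations):
        x = rng.standard_normal((n, p))
        mean_rand += np.linalg.eigvalsh(smc_reduced(np.corrcoef(x, rowvar=False)))[::-1]
    mean_rand /= n_iterations
    # Here the bias is the mean random eigenvalue itself, so the adjusted
    # eigenvalue is obs - mean_rand, and the retention threshold is 0.
    return (obs - mean_rand) > 0
```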
Notice that the second form of the retention criterion is the same for both principal component analysis and common factor analysis: the definition of $\lambda^{\text{adj}}_{q}$ changes depending on whether one is retaining components or factors, but the second form is not expressed in terms of $\lambda^{\text{adj}}_{q}$ at all.
One more thing...
Both principal component analysis and principal factor analysis/common factor analysis can be based on the covariance matrix rather than the correlation matrix. Because this changes the assumptions/definitions about the total and common variance, only the second form of the retention criterion ought to be used when basing one's analysis on the covariance matrix.
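If one does work from the covariance matrix, a sketch of the second-form criterion might look as follows; generating the comparison data by independently permuting each observed column (which preserves each variable's variance) is my assumption here, not part of Horn's original formulation:

```python
import numpy as np

def cov_parallel_analysis(data, n_iterations=1000, seed=0):
    """Second-form criterion based on the covariance matrix."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    obs = np.linalg.eigvalsh(np.cov(data, rowvar=False))[::-1]
    mean_rand = np.zeros(p)
    for _ in range(n_iterations):
        # Permute each column independently to break the correlations
        # while keeping each variable's own variance.
        x = np.column_stack([rng.permutation(data[:, j]) for j in range(p)])
        mean_rand += np.linalg.eigvalsh(np.cov(x, rowvar=False))[::-1]
    mean_rand /= n_iterations
    return obs > mean_rand   # retain where observed exceeds mean random
```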
Best Answer
You might wish to read Dinno's *Gently Clarifying the Application of Horn’s Parallel Analysis to Principal Component Analysis Versus Factor Analysis*. Here's a short distillation:
Principal component analysis (PCA) involves the eigen-decomposition of the correlation matrix $\mathbf{R}$ (or less commonly, the covariance matrix $\mathbf{\Sigma}$), to give eigenvectors (which are generally what the substantive interpretation of PCA is about), and eigenvalues, $\mathbf{\Lambda}$ (which are what the empirical retention decisions, like parallel analysis, are based on).
Common factor analysis (FA) involves the eigen-decomposition of the correlation matrix $\mathbf{R}$ with the diagonal elements replaced with the communalities: $\mathbf{C} = \mathbf{R} - \text{diag}(\mathbf{R}^{+})^{+}$, where $\mathbf{R}^{+}$ indicates the generalized inverse (a.k.a. Moore-Penrose inverse, or pseudo-inverse) of matrix $\mathbf{R}$. This decomposition also gives eigenvectors (which, again, are generally what the substantive interpretation of FA is about) and eigenvalues, $\mathbf{\Lambda}$ (which, as with PCA, are what empirical retention decisions like parallel analysis are based on).
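A quick numerical check of this construction, using arbitrary illustrative data: when $\mathbf{R}$ is invertible, the diagonal of $\mathbf{C}$ reduces to the familiar squared multiple correlations, $1 - 1/r^{ii}$.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))      # illustrative data, n=200, p=5
R = np.corrcoef(X, rowvar=False)

# C = R - diag(R^+)^+ : pseudo-invert R, keep its diagonal as a diagonal
# matrix, pseudo-invert that, and subtract from R.
C = R - np.linalg.pinv(np.diag(np.diag(np.linalg.pinv(R))))

# For invertible R the diagonal of C equals the squared multiple
# correlation of each variable with the rest: 1 - 1/r_ii, where r_ii
# is the i-th diagonal element of R^{-1}.
smc = 1 - 1 / np.diag(np.linalg.inv(R))
print(np.allclose(np.diag(C), smc))    # True
```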
The eigenvalues, $\mathbf{\Lambda} = \{\lambda_{1}, \dots, \lambda_{p}\}$ ($p$ equals the number of variables producing $\mathbf{R}$) are arranged from largest to smallest, and in a PCA based on $\mathbf{R}$ are interpreted as apportioning $p$ units of total variance under an assumption that each observed variable contributes 1 unit to the total variance. When PCA is based on $\mathbf{\Sigma}$, then each eigenvalue, $\lambda$, is interpreted as apportioning $\text{trace}(\mathbf{\Sigma})$ units of total variance under the assumption that each variable contributes the magnitude of its variance to total variance. In FA, the eigenvalues are interpreted as apportioning $< p$ units of common variance; this interpretation is problematic because eigenvalues in FA can be negative and it is difficult to know how to interpret such values either in terms of apportionment, or in terms of variance.
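These apportionment claims are easy to verify numerically (the data below are arbitrary and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 6))      # illustrative data, n=100, p=6
R = np.corrcoef(X, rowvar=False)
S = np.cov(X, rowvar=False)

# Correlation-based PCA: the p eigenvalues apportion p units of variance.
print(np.isclose(np.linalg.eigvalsh(R).sum(), R.shape[0]))   # True
# Covariance-based PCA: the eigenvalues apportion trace(Sigma) units.
print(np.isclose(np.linalg.eigvalsh(S).sum(), np.trace(S)))  # True
# FA (reduced matrix): the eigenvalues sum to less than p, and some
# are typically negative for data like these.
C = R - np.linalg.pinv(np.diag(np.diag(np.linalg.pinv(R))))
ev = np.linalg.eigvalsh(C)
print(ev.sum() < R.shape[0], ev.min())
```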
The parallel analysis procedure involves:

1. Randomly generating a data set with the same $n$ and $p$ as one's own data;
2. Computing its correlation matrix (or, for FA, the matrix $\mathbf{C}$ with communalities on the diagonal) and performing the eigen-decomposition;
3. Repeating steps 1 and 2 "a large number" of times;
4. Taking the mean of each of the $p$ eigenvalues across all the random data sets; and
5. Retaining those observed eigenvalues that exceed the corresponding mean random eigenvalues.
Monte Carlo parallel analysis employs a high centile (e.g. the 95$^{\text{th}}$) rather than the mean in step 4.
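A sketch of this centile variant, which differs from the mean-based sketch above only in the threshold (the 95th centile and the iteration count are conventional but arbitrary choices):

```python
import numpy as np

def pca_parallel_analysis_centile(data, n_iterations=1000, centile=95, seed=0):
    """Monte Carlo variant: compare each observed eigenvalue to a high
    centile (rather than the mean) of its random counterparts."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    obs = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    rand = np.empty((n_iterations, p))
    for i in range(n_iterations):
        x = rng.standard_normal((n, p))
        rand[i] = np.linalg.eigvalsh(np.corrcoef(x, rowvar=False))[::-1]
    # Per-position centile across the random replicates.
    thresholds = np.percentile(rand, centile, axis=0)
    return obs > thresholds
```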