What's the intuition behind Velicer's minimum average partial (MAP) test?

factor-analysis, partial-correlation, pca

In my field both the parallel test and Velicer's minimum average partial (MAP) test are commonly used when researchers are considering how many factors to retain in factor analysis or components to retain in PCA.

I find the parallel test reasonably intuitive – we want factors/components that account for more variance than factors/components derived from random data.

As I understand it, in the MAP test we do a PCA/factor analysis with only one factor/component, partial that PC/factor out of the correlations between the observed variables, and then calculate the average of the squared partial correlations from the off-diagonal of the partial correlation matrix. We then repeat the process with two factors/components, and so on. Finally, we look for the number of factors/components that yields the lowest average squared partial correlation. In code, the mechanics I have in mind look roughly like the sketch below.
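To check that I have the procedure right, here is a rough numpy sketch of my understanding (the function name `map_test` and all implementation details are mine, not from Velicer's paper):

```python
import numpy as np

def map_test(R):
    """Sketch of Velicer's MAP test on a p x p correlation matrix R.

    Returns the suggested number of components and the average
    squared partial correlation for m = 0, 1, ..., p-1 components.
    """
    p = R.shape[0]
    # Eigendecomposition of R; eigh returns ascending order, so reverse it.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Unrotated PCA loadings; clip tiny negative eigenvalues from rounding.
    loadings = eigvecs * np.sqrt(np.maximum(eigvals, 0.0))

    off_diag = ~np.eye(p, dtype=bool)
    avg_sq = np.empty(p)
    avg_sq[0] = (R[off_diag] ** 2).mean()    # m = 0 baseline: raw correlations
    for m in range(1, p):
        A = loadings[:, :m]
        C = R - A @ A.T                      # partial covariance matrix
        d = np.sqrt(np.diag(C))              # residual standard deviations
        R_star = C / np.outer(d, d)          # partial correlation matrix
        avg_sq[m] = (R_star[off_diag] ** 2).mean()
    return int(np.argmin(avg_sq)), avg_sq
```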

On some level I think I understand the mechanics of the process described above, but I don't understand how it relates to selecting an appropriate number of factors/principal components. The original paper is short and fairly readable, but I did not feel after reading it that I understood the concept intuitively.

Velicer, W. F. (1976). Determining the number of components from the matrix of partial correlations. Psychometrika, 41(3), 321-327.

Best Answer

I think the intuition behind MAP can be grasped by looking at the formula for the partial correlation, given as equation 11 in Velicer's (1976) paper, which I reproduce here for convenience:

$$r_{ij.y} = \frac{r_{ij}-r_{iy}r_{jy}}{((1-r_{iy}^2)(1-r_{jy}^2))^{1/2}}$$

The numerator is the partial covariance between each pair of variables $i$ and $j$; it goes down as you partial out more components, since you are removing systematic variance. The denominator is a normalization, much like dividing a covariance by the product of the standard deviations to obtain a correlation coefficient bounded between -1 and +1. It contains the correlations between each of the two variables and the component $y$ being removed. These correlations go up as you keep removing components, since later components contain more and more individual variability/noise, and this makes the denominator as a whole go down. So both the numerator and the denominator shrink as you remove more components: the numerator because the partial covariance falls as common/systematic variability is removed, the denominator because the components capture more and more individual variability.
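Two extreme cases (with illustrative numbers of my own, not from the paper) make this concrete. If $y$ is a genuine common component on which both variables load $.70$ and which fully accounts for their correlation, so that $r_{ij} = .70 \times .70 = .49$, partialling it out drives the partial correlation to zero:

$$r_{ij.y} = \frac{.49 - .70 \times .70}{((1-.49)(1-.49))^{1/2}} = 0$$

At the other extreme, suppose the two variables correlate only $r_{ij} = .30$ and $y$ is simply their standardized sum, so that $r_{iy} = r_{jy} = \sqrt{(1+.30)/2} \approx .81$. Such a component consists largely of the variables' individual variability, and partialling it out inflates the partial correlation all the way to its bound:

$$r_{ij.y} = \frac{.30 - .65}{((1-.65)(1-.65))^{1/2}} = -1$$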

Now there comes a point at which the denominator starts decreasing faster than the numerator, because you are removing more individual variability than systematic variability from the data. This makes the partial correlations go up. The MAP criterion computes the average of the squared partial correlations and tells you to stop when this average stops going down and starts going up, i.e., when you start removing more individual variability than common variability from the data.
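You can see this turning point concretely with a quick simulation (reusing the `map_test` sketch from the question; the data-generating setup, with three block factors, is arbitrary and only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 1000, 12, 3                    # observations, variables, true factors

# Each block of four variables loads 0.8 on one of the three factors.
loadings = np.zeros((p, k))
for j in range(k):
    loadings[4 * j:4 * (j + 1), j] = 0.8

F = rng.standard_normal((n, k))          # latent factor scores
X = F @ loadings.T + 0.6 * rng.standard_normal((n, p))

R = np.corrcoef(X, rowvar=False)
m, avg_sq = map_test(R)
print(m)   # should typically print 3: the average squared partial
           # correlation falls through m = 1, 2, 3, then rises again
```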