PCA Multicollinearity – Is PCA Unstable Under Multicollinearity?

multicollinearity, pca

I know that in a regression situation, if you have a set of highly correlated variables this is usually "bad" because of the instability in the estimated coefficients (variance goes toward infinity as determinant goes towards zero).

My question is whether this "badness" persists in a PCA situation. Do the coefficients/loadings/weights/eigenvectors for any particular PC become unstable/arbitrary/non-unique as the covariance matrix becomes singular? I am particularly interested in the case where only the first principal component is retained, and all others are dismissed as "noise" or "something else" or "unimportant".

I don't think that it does, because you will just be left with a few principal components which have zero, or close to zero variance.

It is easy to see that the instability does not arise in the simple extreme case of 2 variables: suppose they are perfectly correlated. Then the first PC will be the exact linear relationship, and the second PC will be perpendicular to the first PC, with all PC values equal to zero for all observations (i.e. zero variance). I am wondering whether this holds more generally.
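A quick numerical check of the two-variable case, as a minimal sketch: the data are simulated, and the sample size, seed, and the linear map `2*x + 1` are arbitrary choices made only for illustration. PCA is done here by an eigendecomposition of the correlation matrix.

```python
import numpy as np

# Simulated data (assumption): the second variable is an exact
# linear function of the first, so the correlation is 1.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([x, 2.0 * x + 1.0])

# PCA via eigendecomposition of the correlation matrix.
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending order
order = np.argsort(eigvals)[::-1]         # reorder: largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# PC scores of the standardized data.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
scores = Z @ eigvecs

print("eigenvalues:", eigvals)
print("first PC loadings:", eigvecs[:, 0])
print("largest |score| on second PC:", np.abs(scores[:, 1]).max())
```

The first eigenvector should come out with equal absolute weights on both variables (the sign is arbitrary), and the second PC should carry essentially no variance, up to floating-point rounding.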

Best Answer

The answer can be given in even simpler terms: seen in terms of linear algebra, multiple regression has one more step than PCA, and the instability arises in that second step:

The first step of both PCA and multiple regression can be seen as factoring the correlation matrix $R$ into two Cholesky factors, $R = L \cdot L^t$, which are triangular, and this factoring is indifferent to low or high correlations. (PCA can then be seen as a rotation of that triangular Cholesky factor to PC position; this is called Jacobi rotation, as far as I remember.)
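To make the "rotate the Cholesky factor to PC position" idea concrete, here is a small sketch. The 3×3 correlation matrix is made up for illustration, and the rotation is taken from an SVD of $L$ rather than built up from iterated Jacobi rotations, so it shows only the end result of such a rotation, not that procedure itself.

```python
import numpy as np

# Hypothetical correlation matrix, chosen only for illustration.
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.5],
              [0.3, 0.5, 1.0]])

# Step 1: Cholesky factor, lower triangular, with R = L @ L.T
L = np.linalg.cholesky(R)

# Step 2: find an orthogonal rotation that brings L to PC position.
# With L = U @ diag(s) @ Vt, the rotated factor L @ Vt.T equals U * s.
U, s, Vt = np.linalg.svd(L)
loadings = L @ Vt.T

# The rotated factor still reproduces R, and its columns are orthogonal
# with squared lengths equal to the eigenvalues of R.
eigvals = np.linalg.eigvalsh(R)[::-1]     # descending order

print("R reproduced:", np.allclose(loadings @ loadings.T, R))
print("column sums of squares:", (loadings ** 2).sum(axis=0))
print("eigenvalues of R:      ", eigvals)
```

Nothing is inverted anywhere in this route, which is the point of the argument: both the factoring and the rotation go through regardless of how large the correlations are.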

The multiple regression procedure is then to invert that Cholesky factor $L$, minus the row and column of the dependent variable, which is conveniently placed in the last row of the correlation matrix.
The instability comes into play here: if the independent variables are highly correlated, then the diagonal of the Cholesky factor $L$ can degenerate to very small numeric values, and inverting it then introduces the problem of division by nearly zero.
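The effect can be illustrated with just two predictors whose correlation $r$ is pushed toward 1. This is only a sketch under simplifying assumptions: it works on the predictor block of the correlation matrix alone (the dependent variable is left out), and `diagnose` is a hypothetical helper name, not part of any library.

```python
import numpy as np

def diagnose(r):
    """For two predictors with correlation r, return the smallest
    Cholesky diagonal entry, the largest absolute entry of the inverted
    Cholesky factor, and the absolute first-PC loadings."""
    Rxx = np.array([[1.0, r],
                    [r, 1.0]])
    L = np.linalg.cholesky(Rxx)
    smallest_pivot = L.diagonal().min()
    largest_inverse_entry = np.abs(np.linalg.inv(L)).max()
    w, V = np.linalg.eigh(Rxx)            # ascending eigenvalues
    first_pc = np.abs(V[:, -1])           # eigenvector of the largest one
    return smallest_pivot, largest_inverse_entry, first_pc

for r in (0.5, 0.9, 0.99, 0.9999):
    pivot, inv_max, pc1 = diagnose(r)
    print(f"r={r}: smallest pivot={pivot:.4f}, "
          f"max |inv(L)|={inv_max:.2f}, first PC={pc1}")
```

For this 2×2 case the last diagonal entry of $L$ is $\sqrt{1 - r^2}$, so it shrinks toward zero as $r \to 1$ and the entries of the inverse grow accordingly, while the first PC direction keeps equal weights on both variables throughout.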

