Solved – Combining principal component analysis and partial least squares

partial least squares, pca

I know PCA and PLS are usually considered alternatives to each other. But I am thinking about combining the two when there are many predictors with little variability.

In that case, when I run 1-component PLS on the original predictors, it does not produce a model with meaningful predictive performance. But if I first compute 10-20 principal components and run 1-component PLS with those PC scores as predictors, the model predicts quite well in practice. I would like to know why.

Can anybody explain why this is better than 1-component PLS with original predictors?

Best Answer

The reason why 1-component PLS on the original predictors loses to 1-component PLS on the first 10-20 PCA components is that, in the former case, the correlation structure among the original variables is partially lost (the 1-component PLS direction is chosen based only on the correlations between each original input variable and the outcome variable). In the latter case, though, the PCA scores are uncorrelated with each other, which makes 1-component PLS equivalent to ordinary least squares (OLS) regression on the PCA scores, provided the scores are scaled to unit variance (as happens when the PLS routine standardizes its inputs). In other words, no information is lost apart from the remaining PCA directions, which are not very informative anyway.
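The equivalence between 1-component PLS and OLS on uncorrelated PCA scores can be checked numerically. The following is only a sketch under stated assumptions: the data are synthetic, the sizes `n`, `p`, `k` are arbitrary, and the helper names `pls1_one_component_fit` and `ols_fit` are made up for illustration. It compares the fitted values of the two methods on PC scores scaled to unit variance, and again on unscaled scores.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 200, 50, 10  # samples, predictors, retained PCs (illustrative values)

# Synthetic data (an assumption): correlated predictors driven by 5 latent factors.
latent = rng.normal(size=(n, 5))
X = latent @ rng.normal(size=(5, p)) + 0.1 * rng.normal(size=(n, p))
y = latent @ rng.normal(size=5) + 0.1 * rng.normal(size=n)

# Center predictors and response.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# PCA via SVD of the centered predictors; keep the first k score columns.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
pc_scores = U[:, :k] * s[:k]                     # uncorrelated, unequal variances
Z = pc_scores / pc_scores.std(axis=0, ddof=1)    # uncorrelated, unit variance


def pls1_one_component_fit(A, b):
    """Fitted values of 1-component PLS for a single centered response b."""
    w = A.T @ b                      # weight vector: covariance of each column with b
    w /= np.linalg.norm(w)
    t = A @ w                        # the single PLS score vector
    return t * (b @ t) / (t @ t)     # regress b on t


def ols_fit(A, b):
    """Fitted values of least squares regression of b on the columns of A."""
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return A @ coef


same_scaled = np.allclose(pls1_one_component_fit(Z, yc), ols_fit(Z, yc))
same_unscaled = np.allclose(pls1_one_component_fit(pc_scores, yc),
                            ols_fit(pc_scores, yc))
print("scaled scores agree:", same_scaled)
print("unscaled scores agree:", same_unscaled)
```

With unit-variance scores the two fits should agree up to floating-point error, because the cross-product matrix of the scores is then a multiple of the identity. With unscaled scores they generally differ, since the PLS weights then favor the high-variance components.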

In any case, from a predictive-power perspective, it does not really make sense to limit the number of components in PLS (or PCR) to one. Instead, the number of components should be treated as a tuning parameter (hyperparameter) and chosen by resampling, e.g. cross-validation; the R package caret provides a really nice harness for that.
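As a rough illustration of tuning the number of components by resampling, here is a minimal NumPy sketch. It is not the caret workflow: the PLS1 routine `pls1_fit`, the helper `cv_mse`, the synthetic data, and the candidate range 1 to 10 are all assumptions made for the example. It fits PLS with each candidate number of components under 5-fold cross-validation and picks the one with the lowest mean squared error.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """PLS1 (single response) with NIPALS-style deflation of X.

    Returns (coef, x_mean, y_mean) so that predictions are
    (X_new - x_mean) @ coef + y_mean.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yc = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yc                  # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                     # score vector for this component
        tt = t @ t
        p = Xk.T @ t / tt              # X loadings
        W.append(w)
        P.append(p)
        q.append(yc @ t / tt)          # y loading
        Xk = Xk - np.outer(t, p)       # deflate X before the next component
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean


def cv_mse(X, y, n_components, n_folds=5, seed=0):
    """Mean squared prediction error averaged over the folds of a K-fold split."""
    idx = np.random.default_rng(seed).permutation(len(y))
    errors = []
    for fold in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, fold)
        coef, x_mean, y_mean = pls1_fit(X[train], y[train], n_components)
        pred = (X[fold] - x_mean) @ coef + y_mean
        errors.append(np.mean((y[fold] - pred) ** 2))
    return float(np.mean(errors))


# Synthetic data (an assumption): 30 correlated predictors from 4 latent factors.
rng = np.random.default_rng(1)
n, p = 150, 30
latent = rng.normal(size=(n, 4))
X = latent @ rng.normal(size=(4, p)) + 0.2 * rng.normal(size=(n, p))
y = latent @ rng.normal(size=4) + 0.3 * rng.normal(size=n)

candidates = list(range(1, 11))
cv_errors = {k: cv_mse(X, y, k) for k in candidates}
best_k = min(cv_errors, key=cv_errors.get)
print("selected number of components:", best_k)
```

The same seed is used for every candidate so that all of them are compared on identical folds. In practice a library harness (caret in R, as mentioned above) handles the splitting, preprocessing, and selection for you.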
