structural-equation-modeling – Differences Between PLS Regression and PLS Path Modeling: A Critique

partial-least-squares, path-model, structural-equation-modeling

This question was asked here, but no one gave a good answer, so I think it is worth bringing up again. I would also like to add some more comments and questions.

  • The first question is what is the difference between "PLS path modeling" and "PLS regression"? To make it more general, what are structural equation modeling (SEM), path modeling and regression? To my understanding, regression focuses more on prediction, while SEM focuses on the relationship between response and predictors, and path modeling is a special case of SEM?

  • My second question is how trustworthy is PLS? Recently it's been subject to many criticisms, as highlighted in Rönkkö et al. 2016 and Rönkkö et al. 2015, which has led to the rejection of PLS-based papers in high-tier journals such as the Journal of Operations Management (here is the note from the journal editor):

    We are desk rejecting practically all PLS-based manuscripts, because we have concluded that PLS has been without exception the wrong modeling approach in the kinds of models OM researchers use.

    I should note that my field is spectroscopy, not management, psychology, or statistics. In the papers linked above, the authors mostly discuss PLS as an SEM method, but to me their criticism looks applicable to PLS regression as well.

Best Answer

The first question is what is the difference between "PLS path modeling" and "PLS regression"?

None, they are synonyms.

To make it more general, what are structural equation modeling (SEM), path modeling and regression? To my understanding, regression focuses more on prediction, while SEM focuses on the relationship between response and predictors, and path modeling is a special case of SEM?

SEM is a form of regression. Regression is any method that relates independent and dependent variables, including methods that handle multiple variables as separate entities. SEM specifically uses mathematical relationships between the variables to constrain the final model; in the case of PLS, that relationship is the covariance. My understanding is that "path modeling" is a domain-specific term (not from my domain; I'm a spectroscopist like you).
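To make the covariance statement concrete for the single-response case (PLS regression with one dependent variable), the first weight vector can be written as below. The notation is my own and assumes that the predictor matrix $\mathbf{X}$ and the response $\mathbf{y}$ have been column-centered:

```latex
\mathbf{w}_1
  = \arg\max_{\lVert \mathbf{w} \rVert = 1}
    \operatorname{cov}\!\left(\mathbf{X}\mathbf{w},\, \mathbf{y}\right)^{2},
\qquad
\mathbf{w}_1 \propto \mathbf{X}^{\top}\mathbf{y}
```

In words, the first latent variable $\mathbf{t}_1 = \mathbf{X}\mathbf{w}_1$ is the linear combination of the predictors that has the largest covariance with the response. Later components repeat the same step on the deflated data.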

My second question is how trustworthy is PLS? Recently it's been subject to many criticisms as highlighted in Rönkkö et al. 2016 and Rönkkö et al. 2015

An excellent rebuttal can be found in Henseler et al. 2013, "Common Beliefs and Reality About PLS". A main concern of Rönkkö et al. is that PLS did not perform well in some situations that assume a single common latent factor. PLS is in fact designed to handle multiple latent factors, a situation that is much more common in the real world.

How trustworthy? For spectroscopy it is an excellent tool, but it does have its limitations. It runs the risk of overfitting, because it can build complex models that capture contributions from multiple underlying factors. For this reason it needs to be used with care, and appropriate external validation is essential, but these caveats apply to all model-building tools. I have worked mainly on real-world datasets for two decades, and I have not encountered any experimental dataset that had only one common factor underpinning the dependent variable (neither based on the data nor on scientific theory).
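The overfitting point can be illustrated with a small, dependency-free sketch. Everything in it is an assumption made for illustration: the synthetic "spectra" driven by two latent factors, the noise levels, the sample sizes, and the helper names (`pls1_fit`, `pls1_predict`, `make_data`). It implements single-response PLS (PLS1) with the NIPALS deflation scheme and compares the error on the calibration set with the error on an independent hold-out set as components are added. It is not a substitute for a tested library implementation.

```python
import math
import random


def center(rows, means=None):
    """Subtract column means (computed from rows unless given)."""
    if means is None:
        means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[v - m for v, m in zip(r, means)] for r in rows], means


def pls1_fit(X, y, n_components):
    """PLS1 via NIPALS on centered data; returns (w, p, q) per component."""
    X = [row[:] for row in X]
    y = y[:]
    comps = []
    for _ in range(n_components):
        # Weight vector: direction of maximal covariance with y (w ∝ Xᵀy).
        w = [sum(r[j] * yi for r, yi in zip(X, y)) for j in range(len(X[0]))]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break
        w = [v / norm for v in w]
        t = [sum(a * b for a, b in zip(r, w)) for r in X]          # scores
        tt = sum(v * v for v in t)
        if tt < 1e-12:
            break
        p = [sum(r[j] * ti for r, ti in zip(X, t)) / tt
             for j in range(len(X[0]))]                            # X loadings
        q = sum(yi * ti for yi, ti in zip(y, t)) / tt              # y loading
        # Deflate X and y before extracting the next component.
        X = [[x - ti * pj for x, pj in zip(r, p)] for r, ti in zip(X, t)]
        y = [yi - q * ti for yi, ti in zip(y, t)]
        comps.append((w, p, q))
    return comps


def pls1_predict(comps, X):
    """Predict centered y for centered rows by repeating the deflation."""
    preds = []
    for row in X:
        x, yhat = row[:], 0.0
        for w, p, q in comps:
            t = sum(a * b for a, b in zip(x, w))
            yhat += q * t
            x = [a - t * b for a, b in zip(x, p)]
        preds.append(yhat)
    return preds


def rmse(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))


def make_data(n, rng, n_vars=30):
    """Hypothetical 'spectra': two latent factors plus noise."""
    X, y = [], []
    for _ in range(n):
        f1, f2 = rng.gauss(0, 1), rng.gauss(0, 1)
        X.append([f1 * math.sin(j / 5) + f2 * math.cos(j / 3)
                  + rng.gauss(0, 0.3) for j in range(n_vars)])
        y.append(2 * f1 - f2 + rng.gauss(0, 0.3))
    return X, y


rng = random.Random(0)
X_train, y_train = make_data(20, rng)    # small calibration set
X_test, y_test = make_data(100, rng)     # independent hold-out set

# Center both sets with the calibration means only.
Xc, x_means = center(X_train)
Xt, _ = center(X_test, x_means)
y_mean = sum(y_train) / len(y_train)
yc = [v - y_mean for v in y_train]
yt = [v - y_mean for v in y_test]

results = []
for k in range(1, 11):
    comps = pls1_fit(Xc, yc, k)
    results.append((k,
                    rmse(pls1_predict(comps, Xc), yc),
                    rmse(pls1_predict(comps, Xt), yt)))

for k, train_err, test_err in results:
    print(f"{k:2d} components: train RMSE {train_err:.3f}, "
          f"hold-out RMSE {test_err:.3f}")
```

By construction the calibration error cannot increase as components are added, so it says nothing about how many components the data support. The hold-out column is the one to read: a sensible number of components is around where it stops improving. In practice you would use cross-validation or a truly independent test set in the same role.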