Solved – PLS (partial least squares) weights, loadings, and scores interpretations

Tags: partial least squares, scikit-learn

In scikit-learn's PLSRegression, several attributes are available after a model is fitted:

  • Loadings
  • Scores
  • Weights
  • All of the above come in separate X and Y versions (x_loadings_, y_loadings_, and so on; see the sketch below)
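For concreteness, here's a minimal sketch on synthetic data showing where each of these lives on a fitted model (the attribute names are scikit-learn's; the data is made up):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                                      # 100 samples, 5 predictors
Y = X @ rng.normal(size=(5, 2)) + 0.1 * rng.normal(size=(100, 2))  # 2 responses

pls = PLSRegression(n_components=2).fit(X, Y)

print(pls.x_weights_.shape)   # (5, 2)   W: one weight vector per component
print(pls.x_loadings_.shape)  # (5, 2)   P: one loading vector per component
print(pls.x_scores_.shape)    # (100, 2) T: one score column per component
print(pls.y_weights_.shape)   # (2, 2)
print(pls.y_loadings_.shape)  # (2, 2)
print(pls.y_scores_.shape)    # (100, 2) U
```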

I intuitively understand that x_scores_ and y_scores_ should have a linear relationship, since the covariance between the paired score columns is what the algorithm tries to maximize.
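That relationship is easy to sanity-check (same synthetic setup as above, repeated so the snippet stands alone):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = X @ rng.normal(size=(5, 2)) + 0.1 * rng.normal(size=(100, 2))

pls = PLSRegression(n_components=2).fit(X, Y)

# Each X-score column t_a should line up with the matching Y-score
# column u_a: their covariance is the quantity PLS maximizes.
for a in range(2):
    r = np.corrcoef(pls.x_scores_[:, a], pls.y_scores_[:, a])[0, 1]
    print(f"component {a}: corr(t, u) = {r:.3f}")
```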

However, despite reading multiple resources, I find that some articles use the terms loadings and weights interchangeably, even though I know they are different. I think the loadings are the direction vectors that describe where each component is pointing. But what are the weights, then?

TL;DR: What's the difference between weights and loadings in SKLearn PLSRegression?

Best Answer

UPDATE:

I read up on this a bit more for a project I'm working on, and I have some links to share that may be helpful. The "weights" in a PLS model map E_a (the deflated X matrix at step a) to t_a, the corresponding column of the scores matrix T. Deflation happens after each component is extracted: the variability accounted for by that component (the rank-one term t_a p_a') is subtracted from E_a. The loadings, on the other hand, map T back to X.

This is a fantastic reference and goes into much more detail: https://learnche.org/pid/latent-variable-modelling/projection-to-latent-structures/how-the-pls-model-is-calculated

I also read through the vignette for R's pls package several times. It's R, but the concepts translate: https://cran.r-project.org/web/packages/pls/vignettes/pls-manual.pdf
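To make the E_a / t_a / p_a bookkeeping concrete, here is a bare-bones PLS1 (single-response) sketch in plain NumPy, following the NIPALS description in the learnche reference. The function name pls1_nipals and the synthetic data are mine; this illustrates the structure of the algorithm, not scikit-learn's exact implementation:

```python
import numpy as np

def pls1_nipals(X, y, n_components):
    """Bare-bones PLS1 (one response) with NIPALS-style deflation."""
    E = X - X.mean(axis=0)              # E_0: centered X
    f = y - y.mean()                    # centered response
    W, P, T = [], [], []
    for _ in range(n_components):
        w = E.T @ f                     # weight w_a, built from the *current* E_a
        w /= np.linalg.norm(w)
        t = E @ w                       # weights map E_a to the score column t_a
        p = E.T @ t / (t @ t)           # loading p_a: regress columns of E_a on t_a
        E = E - np.outer(t, p)          # deflate: subtract what t_a p_a' accounts for
        f = f - t * (f @ t) / (t @ t)   # deflate the response the same way
        W.append(w); P.append(p); T.append(t)
    return np.column_stack(W), np.column_stack(P), np.column_stack(T)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=100)

W, P, T = pls1_nipals(X, y, n_components=2)

# Loadings go the other way: T @ P.T rebuilds the part of centered X
# that the extracted components explain.
residual = (X - X.mean(axis=0)) - T @ P.T
print(np.linalg.norm(residual))
```

One consequence of the weights applying to the deflated matrices rather than to X itself: scikit-learn also exposes x_rotations_, which maps the original centered X straight to the scores in a single multiplication.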

ORIGINAL ANSWER:

http://www.eigenvector.com/Docs/Wise_pls_properties.pdf

According to this resource, the weights are required to "maintain orthogonal scores." There are some nice visualizations starting on slide 35.
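That orthogonality is easy to verify on a fitted model (synthetic data again; the off-diagonal entries of T'T should vanish up to rounding):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = X @ rng.normal(size=(5, 2)) + 0.1 * rng.normal(size=(100, 2))

T = PLSRegression(n_components=3).fit(X, Y).x_scores_

# T'T should be numerically diagonal: the X-score columns are mutually
# orthogonal, the property the weights are there to maintain.
print(np.round(T.T @ T, 6))
```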