I recently heard about PLS-DA and was wondering how it differs from multinomial logistic regression, since logistic regression can also be used for categorical dependent variables.
Solved – What’s the difference between logistic regression and PLS-DA
logistic, partial least squares, regression
Best Answer
PLS-DA is closely related to LDA: for n > p (more cases than variates), full-rank PLS-DA (i.e. using all latent variables) is the same as LDA. With 1 latent variable, PLS-DA yields the same classification as assigning each case to the class whose mean is closest in (Euclidean) distance in feature space, i.e. the regularization "squeezes" the pooled covariance matrix into a spherical shape.
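The one-latent-variable case can be checked numerically. The following is a minimal sketch, not a reference implementation: it assumes a two-class problem with balanced classes, 0/1 dummy coding, centered but unscaled data, and a decision threshold of 0.5. The function names (`pls1_one_component_fit`, `pls1_predict`, `nearest_centroid_predict`) and the simulated data are made up for illustration.

```python
import numpy as np

def pls1_one_component_fit(X, y):
    """Fit a one-latent-variable PLS1 regression on centered, unscaled data."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = Xc.T @ yc                  # weights: covariance of each variate with y
    w /= np.linalg.norm(w)
    t = Xc @ w                     # scores on the single latent variable
    q = (t @ yc) / (t @ t)         # regress y on the scores
    return x_mean, y_mean, w, q

def pls1_predict(model, X):
    x_mean, y_mean, w, q = model
    return y_mean + ((X - x_mean) @ w) * q

def nearest_centroid_predict(X_train, y_train, X):
    """Assign each row of X to the class with the closest mean (Euclidean)."""
    mu0 = X_train[y_train == 0].mean(axis=0)
    mu1 = X_train[y_train == 1].mean(axis=0)
    d0 = np.linalg.norm(X - mu0, axis=1)
    d1 = np.linalg.norm(X - mu1, axis=1)
    return (d1 < d0).astype(int)

# Simulated data (assumption): two balanced classes sharing a
# non-spherical covariance matrix.
rng = np.random.default_rng(0)
n_per_class, p = 50, 5
cov_factor = rng.normal(size=(p, p))
X0 = rng.normal(size=(n_per_class, p)) @ cov_factor
X1 = rng.normal(size=(n_per_class, p)) @ cov_factor + 1.5
X = np.vstack([X0, X1])
y = np.repeat([0.0, 1.0], n_per_class)

model = pls1_one_component_fit(X, y)
plsda_labels = (pls1_predict(model, X) > 0.5).astype(int)
centroid_labels = nearest_centroid_predict(X, y, X)

# The PLS weight vector should point along the difference of class means.
mean_diff = X1.mean(axis=0) - X0.mean(axis=0)
cosine = model[2] @ mean_diff / np.linalg.norm(mean_diff)
agreement = np.mean(plsda_labels == centroid_labels)
print("agreement with nearest centroid:", agreement)
print("cosine(w, mean difference):", cosine)
```

With unbalanced classes the direction is still the difference of the class means, but a fixed 0.5 cut-off no longer sits at the midpoint between the two means, so the two rules can disagree for cases near the boundary.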
For a two-class problem in which both classes follow a (multivariate) Gaussian distribution with the same covariance matrix (i.e. the situation where LDA is optimal), both LR and LDA yield the same solution.
LR will need more samples to get to the same stability, though.
In other words, there is a somewhat indirect relationship.
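A small simulation can illustrate the LR/LDA agreement under the equal-covariance Gaussian setting. This is a sketch under stated assumptions: the covariance factor `L`, the mean shift, the sample size and the helper names are all invented for the example, and logistic regression is fitted with a plain, unpenalized Newton (IRLS) loop rather than any particular library routine.

```python
import numpy as np

def lda_direction(X, y):
    """LDA direction: inverse pooled covariance times the mean difference."""
    X0, X1 = X[y == 0], X[y == 1]
    R0, R1 = X0 - X0.mean(axis=0), X1 - X1.mean(axis=0)
    pooled = (R0.T @ R0 + R1.T @ R1) / (len(X) - 2)
    return np.linalg.solve(pooled, X1.mean(axis=0) - X0.mean(axis=0))

def logistic_direction(X, y, n_iter=25):
    """Unpenalized logistic regression via Newton steps; returns the slopes."""
    A = np.column_stack([np.ones(len(X)), X])   # intercept column + variates
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(A @ beta)))
        grad = A.T @ (y - prob)
        hess = (A * (prob * (1.0 - prob))[:, None]).T @ A
        beta += np.linalg.solve(hess, grad)
    return beta[1:]

# Simulated data (assumption): two Gaussian classes, same covariance L @ L.T.
rng = np.random.default_rng(1)
n_per_class = 5000
L = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, -0.3, 1.0]])
shift = np.array([1.0, 0.5, -0.5])
X0 = rng.normal(size=(n_per_class, 3)) @ L.T
X1 = rng.normal(size=(n_per_class, 3)) @ L.T + shift
X = np.vstack([X0, X1])
y = np.repeat([0.0, 1.0], n_per_class)

b_lda = lda_direction(X, y)
b_lr = logistic_direction(X, y)
cosine = b_lda @ b_lr / (np.linalg.norm(b_lda) * np.linalg.norm(b_lr))
print("cosine between LDA and LR directions:", cosine)
```

Rerunning with a much smaller `n_per_class` and several seeds is one way to see the stability point: the LR direction would be expected to scatter more than the LDA direction from sample to sample.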
There are important differences between PLS-DA and LR in how they weight cases:
If you (ab)use PLS for dummy regression, as is frequently done in PLS-DA (i.e. y takes class labels encoded as 0 and 1 or equivalent encodings), PLS-DA will try to "squeeze" the within-class distributions to points (as required in regression).
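One way to read the weighting difference is through the loss each method minimizes. The sketch below is only an illustration with made-up scores: it compares the squared error of a dummy regression (target 1) with the logistic loss for a class-1 case that is already on the correct side. The two scores live on different scales (predicted dummy value versus linear predictor), so only the trend matters, not the absolute numbers.

```python
import math

def squared_error(prediction, target=1.0):
    """Least-squares penalty for a case whose dummy-coded label is `target`."""
    return (prediction - target) ** 2

def logistic_loss(score):
    """Negative log-likelihood of a class-1 case with linear predictor `score`."""
    return math.log1p(math.exp(-score))

# Hypothetical scores for correctly classified class-1 cases that lie
# progressively further from the class boundary.
for score in (1.0, 2.0, 5.0):
    print(f"score={score:4.1f}  squared error={squared_error(score):6.2f}  "
          f"logistic loss={logistic_loss(score):.4f}")
```

Under least squares, a case far beyond its target is penalized more the further out it lies, so it keeps pulling on the fit. Under the logistic loss, the same case contributes less and less.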
So if you want to use PLS for classification, make sure that it is appropriate to have all cases weigh in. Two situations where this is the case are