LDA, PCA and k-means: how are they related?

discriminant analysis, k-means, multivariate analysis, pca

I am trying to understand how linear discriminant analysis (LDA) is related to principal component analysis (PCA) and the k-means clustering method. As an example, here is a comparison between PCA and k-means:

[Image: comparison between PCA and k-means]

My question is: how is LDA related to PCA and k-means?

Best Answer

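I'm by no means an expert on the topic, but it seems that k-means clustering can be viewed as a form of dimensionality reduction, of which PCA and LDA are more direct examples. PCA (unsupervised) and LDA (supervised, since it uses class labels) both project the data onto a small number of linear directions. K-means instead replaces each point with the index of its nearest centroid, which summarizes the latent structure of the data in a compressed form and so essentially results in dimensionality reduction. I'm sure that other people will provide more advanced answers to this question.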
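One way I find helpful to line the three methods up is to write their objectives in terms of scatter matrices. This is only a sketch in standard textbook notation, and the symbols are my own choice: $x_1, \dots, x_n$ are the data with overall mean $\mu$, and $C_1, \dots, C_K$ is a partition into groups with means $\mu_k$ and sizes $n_k$ (given class labels for LDA, unknown clusters for k-means).

```latex
\begin{align*}
S_T &= \sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^\top
  && \text{(total scatter)} \\
S_W &= \sum_{k=1}^{K} \sum_{i \in C_k} (x_i - \mu_k)(x_i - \mu_k)^\top
  && \text{(within-group scatter)} \\
S_B &= \sum_{k=1}^{K} n_k \, (\mu_k - \mu)(\mu_k - \mu)^\top
  && \text{(between-group scatter)} \\
S_T &= S_W + S_B
  && \text{(holds for any partition)} \\[6pt]
\text{PCA:} \quad & \max_{\|w\| = 1} \; w^\top S_T \, w \\
\text{LDA:} \quad & \max_{w} \; \frac{w^\top S_B \, w}{w^\top S_W \, w}
  && \text{(partition fixed by the labels)} \\
\text{k-means:} \quad & \min_{C_1, \dots, C_K} \; \operatorname{tr}(S_W)
  = \min_{C_1, \dots, C_K} \sum_{k=1}^{K} \sum_{i \in C_k} \|x_i - \mu_k\|^2
\end{align*}
```

Read this way, PCA ignores the partition and looks only at $S_T$, LDA takes the partition as given and searches for a direction, and k-means searches for the partition itself. Since $\operatorname{tr}(S_T)$ is fixed by the data, minimizing $\operatorname{tr}(S_W)$ in k-means is the same as maximizing $\operatorname{tr}(S_B)$.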

Additionally, I would like to share two references that are relevant to the question and, IMHO, rather comprehensive. One is a highly cited research paper by Ding and He (2004) on the relationship between k-means and PCA. The other is a research paper by Martínez and Kak (2001) comparing PCA and LDA.
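To get a feel for both relationships on a toy problem, here is a minimal NumPy sketch. Everything in it is my own assumption for illustration and not taken from either paper: the synthetic two-class data (elongated along x, separated along y), the helper names `kmeans` and `agreement`, and all parameter values. PCA is computed from the covariance eigenvectors, LDA is the two-class Fisher direction, and k-means is a plain Lloyd iteration with k = 2.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): two classes with large spread along x and
# small spread along y, whose means differ only along y.
n = 200
X0 = rng.normal(loc=[0.0, 0.0], scale=[4.0, 0.5], size=(n, 2))
X1 = rng.normal(loc=[0.0, 2.0], scale=[4.0, 0.5], size=(n, 2))
X = np.vstack([X0, X1])
y = np.repeat([0, 1], n)

# PCA (unsupervised): leading eigenvector of the total covariance matrix.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
pca_dir = eigvecs[:, -1]  # eigh returns eigenvalues in ascending order

# Two-class Fisher LDA (supervised): direction proportional to Sw^{-1} (m1 - m0).
m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
lda_dir = np.linalg.solve(Sw, m1 - m0)
lda_dir /= np.linalg.norm(lda_dir)


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd iterations, initialized from k random data points."""
    init_rng = np.random.default_rng(seed)
    centers = X[init_rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


def agreement(a, b):
    """Fraction of matching binary labels, ignoring which group is called 0 or 1."""
    acc = np.mean(a == b)
    return max(acc, 1.0 - acc)


labels, centers = kmeans(X, k=2)

# Split the data by the sign of the first principal component score.
pc1_sign = (Xc @ pca_dir > 0).astype(int)

# Classify with LDA by thresholding the projection at the midpoint of the class means.
lda_pred = (X @ lda_dir > ((m0 + m1) / 2) @ lda_dir).astype(int)

print("PCA direction:", pca_dir)
print("LDA direction:", lda_dir)
print("k-means vs sign of PC1:", agreement(labels, pc1_sign))
print("k-means vs class labels:", agreement(labels, y))
print("LDA accuracy:", np.mean(lda_pred == y))
```

On data shaped like this, I would expect the k-means clusters to line up with the sign of the first principal component rather than with the class labels, while the LDA direction points along the axis that actually separates the classes. That outcome depends on the data being elongated in a direction unrelated to the classes, so treat it as an illustration rather than a general result.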

References

Ding, C., & He, X. (2004, July). K-means clustering via principal component analysis. In Proceedings of the twenty-first International Conference on Machine Learning (p. 29). ACM.

Martínez, A. M., & Kak, A. C. (2001). PCA versus LDA. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(2), 228-233.