Beginner references to understand probabilistic principal component analysis (PPCA)

latent-variable, machine learning, pca, references

I am completely new to machine learning. I started studying PCA with Jonathan Shlens, 2005, A Tutorial on Principal Component Analysis. The paper provides concrete examples and the background mathematics, so anyone can pick it up and understand PCA thoroughly. It is simply amazing.

I am now interested in understanding the probabilistic model that leads to PCA (i.e. PPCA), and I was wondering if anyone could point me to similarly accessible, self-contained material. The papers I found on the internet are not easy to follow, mainly because I am not familiar with most of the statistical references they cite.

Can anyone give me a list of things I should know to understand PPCA and/or a proper book reference?

Best Answer

PPCA was introduced in Tipping & Bishop, 1999, Probabilistic Principal Component Analysis. I would say that this paper itself is one of the best references: it is concise and clear.

Nevertheless, it might be difficult for a beginner. If so, you can try Bishop's textbook Pattern Recognition and Machine Learning, which is excellent and includes a thorough discussion of PPCA in Chapter 12. To prepare for this chapter, you would need some understanding of basic probability theory (Chapter 1), the multivariate Gaussian distribution (Chapter 2), and the expectation-maximization algorithm (Chapter 9). The entire book is freely available online as a PDF.
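To see where the Gaussian prerequisites come in, here is a minimal sketch of the generative model behind PPCA as defined by Tipping & Bishop: a latent variable z ~ N(0, I) is mapped linearly and corrupted by isotropic noise, x = W z + mu + eps with eps ~ N(0, sigma^2 I), so that marginally x ~ N(mu, W W^T + sigma^2 I). The code below is only an illustration, not anything taken from the paper or the book: the function names are hypothetical, the latent dimension is fixed to 1 for simplicity, and the parameter values are arbitrary. It samples from the model and compares the sample covariance with W W^T + sigma^2 I.

```python
import random


def sample_ppca(n, w, mu, sigma, seed=0):
    """Draw n points from a PPCA model with a one-dimensional latent variable.

    Model: z ~ N(0, 1), x = w * z + mu + eps, eps ~ N(0, sigma^2 I).
    Here w and mu are lists of length d (w plays the role of the d x 1 matrix W).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)  # latent variable
        # each observed coordinate gets its own independent noise draw
        x = [w_i * z + mu_i + rng.gauss(0.0, sigma) for w_i, mu_i in zip(w, mu)]
        samples.append(x)
    return samples


def sample_covariance(samples):
    """Unbiased sample covariance matrix of a list of d-dimensional points."""
    n = len(samples)
    d = len(samples[0])
    mean = [sum(x[i] for x in samples) / n for i in range(d)]
    return [
        [
            sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in samples) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]


def model_covariance(w, sigma):
    """Covariance implied by the model: W W^T + sigma^2 I."""
    d = len(w)
    return [
        [w[i] * w[j] + (sigma ** 2 if i == j else 0.0) for j in range(d)]
        for i in range(d)
    ]


if __name__ == "__main__":
    w, mu, sigma = [2.0, 1.0], [1.0, -1.0], 0.5  # arbitrary illustrative values
    data = sample_ppca(20000, w, mu, sigma)
    print("sample covariance:", sample_covariance(data))
    print("model covariance: ", model_covariance(w, sigma))
```

The two matrices should agree up to sampling noise. Fitting W and sigma from data, either in closed form or with EM, is the part that Chapter 12 of Bishop's book and the original paper work through.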