As others have told you, PCA does not look for amplitude; it looks for correlations between the columns. In fact, it is standard procedure to normalize your variables before a PCA, which, by the way, you did not do.
You would get the result you want by:
- Randomly generating a column.
- Generating a second random column with similar parameters, but also adding the first column to it. In your example this would basically be the first column + RANDBETWEEN.
- Generating additional uncorrelated columns as in the first step.
- Normalizing and then getting the eigenvalues and eigenvectors (a minimal sketch follows this list).
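Here is a minimal sketch of this recipe in Matlab rather than Excel; the column length, noise scale, and number of columns are made-up example values:

```matlab
% A minimal sketch of the recipe above; all sizes and scales are made up.
rng(0);
n = 100;
x1 = randn(n, 1);                  % step 1: a random column
x2 = x1 + 0.5*randn(n, 1);         % step 2: first column plus noise
x3 = randn(n, 1);                  % step 3: an uncorrelated column
X  = [x1 x2 x3];
Xn = (X - mean(X)) ./ std(X);      % step 4: normalize (z-score)
[V, D] = eig(cov(Xn), 'vector');   % eigenvectors and eigenvalues
```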
This is an excellent question.
Probabilistic PCA (PPCA) is the following latent variable model:
\begin{align}
\mathbf z &\sim \mathcal N(\mathbf 0, \mathbf I) \\
\mathbf x &\sim \mathcal N(\mathbf W \mathbf z + \boldsymbol \mu, \sigma^2 \mathbf I),
\end{align}
where $\mathbf x\in\mathbb R^p$ is one observation and $\mathbf z\in\mathbb R^q$ is a latent variable vector; usually $q\ll p$. Note that this differs from factor analysis in only one small detail: the error covariance structure in PPCA is $\sigma^2 \mathbf I$, whereas in FA it is an arbitrary diagonal matrix $\boldsymbol \Psi$.
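To make the generative process concrete, here is a minimal sampling sketch; $p$, $q$, $n$, $\mathbf W$, $\boldsymbol\mu$, and $\sigma^2$ are all made-up example values:

```matlab
% A minimal sketch of sampling from the PPCA model; inputs are made up.
rng(0);
p = 5; q = 2; n = 1e5;
W = randn(p, q); mu = randn(p, 1); sigma2 = 0.1;
Z = randn(q, n);                             % z ~ N(0, I)
X = W*Z + mu + sqrt(sigma2)*randn(p, n);     % x ~ N(Wz + mu, sigma2*I)
% Marginally x ~ N(mu, W*W' + sigma2*I); the sample covariance agrees:
disp(norm(cov(X') - (W*W' + sigma2*eye(p)), 'fro'))   % small for large n
```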
Tipping & Bishop (1999), Probabilistic Principal Component Analysis, prove the following theorem: the maximum likelihood solution for PPCA can be obtained analytically and is given by (their Eq. 7): $$\mathbf W_\mathrm{ML} = \mathbf U_q (\boldsymbol \Lambda_q - \sigma_\mathrm{ML}^2 \mathbf I)^{1/2} \mathbf R,$$ where $\mathbf U_q$ is the matrix of the $q$ leading principal directions (eigenvectors of the covariance matrix), $\boldsymbol \Lambda_q$ is the diagonal matrix of the corresponding eigenvalues, $\sigma_\mathrm{ML}^2$ is also given by an explicit formula, and $\mathbf R$ is an arbitrary $q\times q$ rotation matrix (corresponding to rotations in the latent space).
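To make this concrete, here is a minimal sketch of computing the closed-form solution from a sample covariance matrix; the dataset is made up, and $\sigma_\mathrm{ML}^2$ is taken as the average of the discarded eigenvalues, which is the paper's explicit formula:

```matlab
% A minimal sketch of the closed-form ML solution on made-up data.
rng(2);
n = 500; p = 6; q = 2;
X = randn(n, q) * randn(q, p) + 0.3*randn(n, p);  % low-rank data + noise
[V, D] = eig(cov(X), 'vector');
[lambda, idx] = sort(D, 'descend');
Uq = V(:, idx(1:q));                         % leading directions U_q
sigma2 = mean(lambda(q+1:end));              % sigma^2_ML: mean of the rest
W = Uq * diag(sqrt(lambda(1:q) - sigma2));   % W_ML, choosing R = I
```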
The ppca() function implements an expectation-maximization (EM) algorithm to fit the model, but we know that it must converge to the $\mathbf W_\mathrm{ML}$ given above.
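As a quick sanity check, here is a sketch (assuming the Statistics and Machine Learning Toolbox) comparing the EM fit to classical PCA on made-up data:

```matlab
% A sketch: ppca's EM fit should agree, up to signs, with classical PCA.
rng(3);
X = randn(200, 2) * randn(2, 5) + 0.2*randn(200, 5);  % made-up n x p data
q = 2;
coeffPca  = pca(X);                 % classical principal directions
coeffPpca = ppca(X, q);             % EM fit of the PPCA model
disp(norm(abs(coeffPpca' * coeffPca(:, 1:q)) - eye(q)))  % approximately 0
```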
Your question is: how to get $\mathbf U_q$ if you know $\mathbf W_\mathrm{ML}$.
The answer is that you can simply use the singular value decomposition of $\mathbf W_\mathrm{ML}$. The formula above is already of the form orthogonal matrix times diagonal matrix times orthogonal matrix, so it gives the SVD, and since the SVD is unique (up to signs, provided the singular values are distinct), you will get $\mathbf U_q$ as the left singular vectors of $\mathbf W_\mathrm{ML}$.
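Here is a minimal sketch of this recovery; $\mathbf U_q$, the eigenvalues, $\sigma^2$, and $\mathbf R$ are all made-up inputs:

```matlab
% A minimal sketch: build W_ML from known directions, recover them by SVD.
rng(1);
p = 10; q = 3;
[U, ~] = qr(randn(p, q), 0);               % orthonormal "true" U_q
lambda = [5; 3; 2];                        % leading eigenvalues
sigma2 = 0.5;                              % sigma^2_ML
[R, ~] = qr(randn(q));                     % arbitrary latent rotation R
W = U * diag(sqrt(lambda - sigma2)) * R;   % W_ML as in Eq. 7
[Uhat, ~, ~] = svd(W, 'econ');             % left singular vectors
disp(norm(abs(Uhat'*U) - eye(q)))          % ~0: U_q recovered up to signs
```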
Taking the SVD is exactly what Matlab's ppca() function is doing in line 305:
% Orthogonalize W to the standard PCA subspace
[coeff,~] = svd(W,'econ');
> Can I assume the principal subspace is spanned only by a unique set of orthonormal vectors?
No! There are infinitely many orthonormal bases spanning the same principal subspace. If you apply some arbitrary orthogonalization process to $\mathbf W_\mathrm{ML}$, you are not guaranteed to obtain $\mathbf U_q$. But if you use SVD or something equivalent, it will work (see the sketch below).
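A minimal sketch of the failure mode, with made-up inputs as before: QR-based orthogonalization lands in the right subspace but generally not on $\mathbf U_q$:

```matlab
% A minimal sketch: QR orthogonalization of W_ML spans the correct
% subspace but generally returns a different basis than U_q.
rng(1);
p = 10; q = 3;
[U, ~] = qr(randn(p, q), 0);          % the "true" U_q
[R, ~] = qr(randn(q));                % arbitrary latent rotation
W = U * diag([2.0 1.5 1.0]) * R;      % a W_ML-shaped matrix
[Q, ~] = qr(W, 0);                    % Gram-Schmidt-style orthogonalization
disp(norm(Q*Q' - U*U'))               % ~0: same principal subspace
disp(norm(abs(Q'*U) - eye(q)))        % generally NOT ~0: not U_q itself
```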
PPCA was introduced in Tipping & Bishop, 1999, Probabilistic Principal Component Analysis. I would say that this paper itself is one of the best references: it is concise and clear.
Nevertheless, it might be difficult for a beginner. If so, you can try Bishop's textbook Pattern Recognition and Machine Learning, which is excellent and includes a thorough discussion of PPCA in Chapter 12. To prepare for this chapter, one needs some understanding of basic probability theory (Chapter 1), the multivariate Gaussian distribution (Chapter 2), and the expectation-maximization algorithm (Chapter 9). The entire book is freely available online as a PDF.