The relationship between independent component analysis and factor analysis

Tags: factor analysis, independent component analysis, multivariate analysis

I am new to Independent Component Analysis (ICA) and have just a rudimentary understanding of the method. It seems to me that ICA is similar to Factor Analysis (FA) with one exception: ICA assumes that the observed random variables are a linear combination of independent, non-Gaussian components/factors, whereas the classical FA model assumes that the observed random variables are a linear combination of correlated, Gaussian components/factors.

Is the above accurate?

Best Answer

[Figure: scatter of two mixed variables with the PCA and ICA basis vectors overlaid]

FA, PCA, and ICA are all 'related', in as much as all three of them seek basis vectors onto which the data is projected, such that you maximize some insert-criteria-here. Think of the basis vectors as just encapsulating linear combinations.

For example, let's say your data matrix $\mathbf Z$ is a $2 \times N$ matrix, that is, you have two random variables and $N$ observations of each. Then let's say you found a basis vector $\mathbf w = \begin{bmatrix}0.1 \\ -4 \end{bmatrix}$. When you extract (the first) signal (call it the vector $\mathbf y$), it is done like so:

$$ \mathbf {y = w^{\mathrm T}Z} $$

This just means "multiply the first row of your data by 0.1, and subtract 4 times the second row of your data". The result is $\mathbf y$, which is of course a $1 \times N$ vector with the property that its insert-criteria-here is maximized.
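For concreteness, here is a minimal NumPy sketch of that extraction step (the data and the basis vector are made up; the only point is the matrix product):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 2 random variables, N = 1000 observations of each.
N = 1000
Z = rng.standard_normal((2, N))

# The basis vector from the example above.
w = np.array([0.1, -4.0])

# y = w^T Z: 0.1 times the first row minus 4 times the second row.
y = w @ Z          # shape (N,): one extracted signal
print(y.shape)     # (1000,)
```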

So what are those criteria?

Second-Order Criteria:

In PCA, you are finding basis vectors that 'best explain' the variance of your data. The first (i.e., highest ranked) basis vector is the one that captures the most variance in your data. The second one obeys the same criterion, but must be orthogonal to the first, and so on and so forth. (It turns out that those basis vectors are nothing but the eigenvectors of your data's covariance matrix.)
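A short sketch of that eigenvector view, on made-up correlated data (the projections come out uncorrelated, with variances equal to the eigenvalues):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical correlated 2-D data, as a 2 x N matrix.
N = 1000
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
Z = A @ rng.standard_normal((2, N))

# Center the data, then eigendecompose its covariance matrix.
Zc = Z - Z.mean(axis=1, keepdims=True)
eigvals, eigvecs = np.linalg.eigh(np.cov(Zc))  # eigh: ascending order

# Sort descending: the first column is the top PCA basis vector.
order = np.argsort(eigvals)[::-1]
W_pca = eigvecs[:, order]

# Projections onto the basis vectors are uncorrelated, and their
# variances are the eigenvalues.
Y = W_pca.T @ Zc
print(np.round(np.cov(Y), 3))
```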

FA differs from PCA in that FA is generative, whereas PCA is not. I have seen FA described as 'PCA with noise', where the 'noise' terms are called 'specific factors'. All the same, the overall conclusion is that PCA and FA are based on second-order statistics (covariance), and nothing above that.
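To make 'generative' concrete, the classical FA model writes the observed vector as a loading matrix times Gaussian common factors, plus a noise term (the specific factors):

$$ \mathbf z = \mathbf{A} \mathbf f + \boldsymbol \epsilon, \qquad \mathbf f \sim \mathcal N(\mathbf 0, \mathbf I), \quad \boldsymbol \epsilon \sim \mathcal N(\mathbf 0, \boldsymbol \Psi), $$

with $\boldsymbol \Psi$ diagonal. Everything on the right-hand side is Gaussian, so the model is fully characterized by second-order statistics.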

Higher Order Criteria:

In ICA, you are again finding basis vectors, but this time you want basis vectors such that the projection of the data onto them recovers one of the independent components of the original data. You can do this by maximizing the absolute value of the normalized kurtosis, a 4th-order statistic. That is, you project your data onto some basis vector and measure the kurtosis of the result. You change your basis vector a little (usually through gradient ascent), and then measure the kurtosis again, etc. Eventually you will happen upon a basis vector that gives you a result with the highest possible kurtosis, and this is your independent component.
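Here is a toy projection-pursuit sketch of that loop, with made-up sources and mixing matrix (the whitening step, step size, and iteration count are arbitrary choices, not any particular ICA algorithm):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical sources: two independent non-Gaussian signals.
N = 5000
S = np.vstack([rng.laplace(size=N),          # super-Gaussian
               rng.uniform(-1, 1, size=N)])  # sub-Gaussian
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])                   # made-up mixing matrix
Z = A @ S

# Whiten: rotate and scale so the covariance becomes the identity.
Zc = Z - Z.mean(axis=1, keepdims=True)
eigvals, eigvecs = np.linalg.eigh(np.cov(Zc))
X = (eigvecs / np.sqrt(eigvals)).T @ Zc

def kurtosis(y):
    """Normalized (excess) kurtosis, E[y^4] - 3, for unit-variance y."""
    return np.mean(y**4) - 3.0

# Gradient ascent on |kurtosis| over unit-norm basis vectors w.
w = rng.standard_normal(2)
w /= np.linalg.norm(w)
for _ in range(200):
    y = w @ X
    # d/dw E[(w^T x)^4] = 4 E[(w^T x)^3 x]; follow the sign of the kurtosis
    # so that both super- and sub-Gaussian components are extrema.
    grad = 4.0 * (y**3 @ X.T) / N
    w += 0.1 * np.sign(kurtosis(y)) * grad
    w /= np.linalg.norm(w)   # re-project onto the unit sphere

print("kurtosis of extracted signal:", round(kurtosis(w @ X), 3))
```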

The top diagram above can help you visualize this. You can clearly see how the ICA vectors correspond to the axes of the data (which are independent of each other), whereas the PCA vectors try to find the directions in which variance is maximized (somewhat like resultant vectors).

If in the top diagram the PCA vectors look like they almost correspond to the ICA vectors, that is just coincidental. Here is another instance, on different data and with a different mixing matrix, where they are very different. ;-)

[Figure: second example, with different data and mixing matrix, where the PCA and ICA basis vectors clearly differ]