Solved – PCA on returns or levels

MATLAB, pca

I am running PCA on 24 economic index time series, which I standardized first. When I run the PCA on the levels of the indices, I get fairly similar loadings on each index, which I think is good, since no single index should dominate. However, when I run it on the returns of these standardized indices, the loadings are much more extreme: large on one or two indices and almost zero on the rest.

Also, when I run PCA on levels and returns together, I only get non-negligible loadings on the levels!

What is the interpretation of this? How are such different results possible?

Additional info: I am doing this in MATLAB with the princomp function.

Best Answer

The most common approach is to apply PCA to the covariance (or correlation) matrix of the log changes in the data. In that case it does not matter whether you indexed the data, because rescaling a series by a constant leaves its log changes unchanged. You are then looking for the eigenvectors that explain most of the variance of the underlying data. The purpose of this dimension reduction is usually to focus on the key factors and to improve estimation of the covariance matrix.
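As a rough illustration of the point above, here is a minimal Python/NumPy sketch (not the MATLAB call from the question). The data are simulated and every name in it, including pca_on_log_changes, is a hypothetical choice for this example. It runs PCA on the covariance matrix of log changes and checks that re-indexing each series to start at 1 leaves the eigenvalues unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 500 observations of 4 positive-valued series,
# simulated as exponentiated random walks (illustration only).
n_obs, n_series = 500, 4
shocks = rng.normal(0.0, 0.01, size=(n_obs, n_series))
levels = 100.0 * np.exp(np.cumsum(shocks, axis=0))


def pca_on_log_changes(levels):
    """Eigen-decompose the covariance matrix of log changes.

    Returns eigenvalues in descending order and the matching
    eigenvectors as columns.
    """
    log_changes = np.diff(np.log(levels), axis=0)
    cov = np.cov(log_changes, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


eigvals, eigvecs = pca_on_log_changes(levels)

# Re-index every series to start at 1, i.e. divide each column by a constant.
indexed = levels / levels[0]
eigvals_idx, eigvecs_idx = pca_on_log_changes(indexed)

# Share of total variance explained by each component.
explained = eigvals / eigvals.sum()
print(explained)
print(np.allclose(eigvals, eigvals_idx))
```

The invariance holds because log(c * x) differs from log(x) only by a constant, which drops out when differencing. It would not hold for simple differences of the levels.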

If you are dealing with data that have very different variances, then it is possible for the first eigenvector (the first principal component) to be dominated by the high-variance variables. This is especially true if you are mixing data that you should take the log of (like equities or GDP) with data that you probably shouldn't take the log of (interest rates, or confidence indicators that are bounded between 0 and 100). In this case, it can be helpful to run the PCA on the correlation matrix, which removes the impact of differing variances between variables. However, care should be taken when constructing factors from these eigenvectors: instead of multiplying the returns by the eigenvectors, you multiply the z-scores of the returns by the eigenvectors to obtain uncorrelated factors.
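To make the covariance versus correlation distinction concrete, here is a small Python/NumPy sketch on simulated data. The series, their scales, and the helper sorted_eigh are assumptions made up for this example. Three series share one common driver but differ in scale by a factor of 100, so the covariance-based first eigenvector loads almost entirely on the largest-scale series, while the correlation-based one spreads the loadings. The factors are then built from z-scores, as described above.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical returns: one common driver plus noise, on very different scales.
n_obs = 1000
common = rng.normal(size=n_obs)
scales = np.array([10.0, 1.0, 0.1])
returns = (common[:, None] + rng.normal(size=(n_obs, 3))) * scales


def sorted_eigh(matrix):
    """Eigen-decomposition of a symmetric matrix, largest eigenvalue first."""
    eigvals, eigvecs = np.linalg.eigh(matrix)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


# PCA on the covariance matrix: the high-variance series dominates.
val_cov, vec_cov = sorted_eigh(np.cov(returns, rowvar=False))

# PCA on the correlation matrix: scale differences are removed.
val_corr, vec_corr = sorted_eigh(np.corrcoef(returns, rowvar=False))

# Factors from the correlation PCA use z-scores, not raw returns.
z = (returns - returns.mean(axis=0)) / returns.std(axis=0, ddof=1)
factors = z @ vec_corr
factor_cov = np.cov(factors, rowvar=False)

print("covariance PCA, first loadings: ", vec_cov[:, 0])
print("correlation PCA, first loadings:", vec_corr[:, 0])
print("factor covariance (should be diagonal):")
print(np.round(factor_cov, 6))
```

The covariance matrix of the z-score factors is diagonal, with the correlation-matrix eigenvalues on the diagonal, which is what "uncorrelated factors" means here.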

That's not to say that it is unreasonable to apply PCA to levels. It is possible to apply PCA to levels and treat the eigenvectors as potentially cointegrating vectors. You can then construct factors from them, apply cointegration tests to those factors, and include the significant ones in an error correction model (ECM).
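A toy Python/NumPy sketch of the levels idea follows, under strong assumptions: two simulated series that share a single random-walk trend plus stationary noise. The data and all variable names are hypothetical. It only shows how a candidate vector and its factor are built from a PCA on levels. It performs no cointegration or unit-root test, which would still have to be run on the resulting factor.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical levels: a shared random-walk trend plus independent noise.
n_obs = 2000
trend = np.cumsum(rng.normal(size=n_obs))
levels = np.column_stack([
    trend + rng.normal(size=n_obs),
    trend + rng.normal(size=n_obs),
])

# PCA on the covariance matrix of the (demeaned) levels.
centered = levels - levels.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))  # ascending

# The smallest-variance direction is the candidate cointegrating vector;
# in this toy setup it should be close to (1, -1) / sqrt(2), up to sign.
candidate = eigvecs[:, 0]
factor = centered @ candidate

print("candidate vector:", candidate)
print("factor variance: ", np.var(factor, ddof=1))
print("trend variance:  ", np.var(trend, ddof=1))
```

In this setup the candidate factor has a far smaller variance than the trend, because the combination cancels the shared random walk. With real data that is only suggestive, so the factor would still need a formal test before going into an ECM.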

So it depends on what you're trying to accomplish, but most people apply PCA to returns.