MATLAB: pca() output: which weight belongs to which factor

factor weight pca

Hi all,
I have a matrix with 14 variables and 113 observations. In the end, I would like to be able to say something like 'variable 4 accounts for 60% of the variance in my data, variable 3 for 26%', etc. I think the command I need is pca(), but I don't really understand how to get from MATLAB's output to 'this variable explains this much of my data'. Can anyone help? 🙂

Best Answer

But that is NOT what you get from PCA. PCA tells you how important each of a set of linear combinations of the variables (the principal components) is. It does NOT tell you how much variance can be attributed to any single original variable. Sorry, but I think you need to do some reading about PCA, and about statistics in general. If attributing variance to individual variables is your goal, then PCA is not the tool for it.
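To make the distinction concrete, here is a minimal sketch of what pca() actually hands back. It assumes the Statistics and Machine Learning Toolbox is available, and X is only a random stand-in for the 113-by-14 matrix from the question, so the numbers themselves mean nothing. The point is where the percentages and the weights live in the output.

```matlab
% Sketch only: X stands in for the 113-by-14 data matrix from the question.
rng(0);                          % fixed seed so the run is repeatable
X = randn(113, 14);

% pca centers the columns of X by default.
% coeff     : 14-by-14, column k holds the weights of component k
% score     : 113-by-14, the observations expressed in component coordinates
% latent    : variance of each component
% explained : percent of total variance captured by each component
[coeff, score, latent, ~, explained] = pca(X);

% These percentages belong to COMPONENTS, not to original variables.
fprintf('Percent of variance per component:\n');
fprintf('  %.1f', explained);
fprintf('\n');

% Row i of coeff corresponds to column i of X, so coeff(i,1) is the
% weight of original variable i in the first principal component.
pc1Weights = coeff(:, 1);
[~, order] = sort(abs(pc1Weights), 'descend');
for k = 1:3
    fprintf('PC1: variable %d has weight %+.3f\n', ...
            order(k), pc1Weights(order(k)));
end
```

So the answer to "which weight belongs to which variable" is the row index of coeff, while explained only describes the components, each of which mixes all 14 variables.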
In general, IF you were to do the computation that you seem to be asking for, computing the amount of variance that could be attributed to each variable in your model, then as long as the variables have any non-zero correlations at all, the percentages you arrive at will always add up to more than 100%. In other words, your variables would together appear to explain more than 100% of the variance in your data.
As a quick check, I tried this on a completely random matrix:
A = randn(113,14);
I found that the 14 variables in this completely random problem together explain something like 272% of the total variation in the dataset, not a cumulative 100%. Since we know the data was completely random, that looks like a paradox. The real paradox is that what you want to do is not really meaningful.
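The over-counting can be illustrated with a short sketch. This is only one possible way to define "variance attributed to a single variable", and it is not necessarily the computation behind the 272% figure, so do not expect it to reproduce that number. Here the variance credited to variable i is taken to be the variance, summed over all columns, that a least-squares fit on variable i alone reproduces. The names explainedByVar and pctByVar are made up for the illustration.

```matlab
% Sketch only: one possible per-variable attribution, not the author's.
rng(0);                          % fixed seed so the run is repeatable
A = randn(113, 14);
C = cov(A);                      % 14-by-14 sample covariance matrix
totalVar = trace(C);             % total variance in the data

% Fitting column j on variable i alone reproduces C(j,i)^2 / C(i,i) of
% column j's variance. Summing over j gives the variance credited to i.
explainedByVar = sum(C.^2, 1) ./ diag(C).';
pctByVar = 100 * explainedByVar / totalVar;

fprintf('Sum of per-variable percentages:  %.1f%%\n', sum(pctByVar));

% For comparison, the per-component percentages from pca add up to 100.
[~, ~, ~, ~, explained] = pca(A);
fprintf('Sum of per-component percentages: %.1f%%\n', sum(explained));
```

Each variable gets full credit for its own variance plus a share of every column it is correlated with, so the shared part is counted more than once. The principal components are uncorrelated with each other, which is why their percentages can be added up without double counting.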