Hi
I am currently trying to apply classification analysis to some EEG data. Because such data is very high-dimensional, I am looking at using PCA for dimensionality reduction to prevent overfitting of the classification models. My data matrix is approximately 50 × 38000 (rows are observations, columns are variables). I used the MATLAB ‘pca’ function to compute principal components from my variables. I have three questions about this.
First, as stated on the MathWorks website (https://uk.mathworks.com/help/stats/pca.html), rows of the input matrix X should correspond to observations and columns to variables, which is the case for my data. However, the number of principal components returned is always equal to the number of rows (observations) minus 1 (I tried different numbers of rows). Why is this the case? Should it be this way? To me, it would be more intuitive if the maximum number of components were equal to the number of columns (variables) minus 1.
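To make the first observation concrete, here is a minimal sketch of how it can be reproduced. Random numbers stand in for my EEG data, and the number of variables is cut to 2000 (instead of roughly 38000) to keep it fast; the sizes and variable names are placeholders, not my actual pipeline. It assumes the Statistics and Machine Learning Toolbox, which provides ‘pca’.

```matlab
% Minimal sketch: random data stands in for the real EEG features.
rng(0);                          % fix the seed so the run is repeatable
nVars = 2000;                    % placeholder; the real data has ~38000 columns
rowCounts = [5 20 50];           % different numbers of observations to try
nComps = zeros(size(rowCounts)); % components returned for each row count

for i = 1:numel(rowCounts)
    X = randn(rowCounts(i), nVars);   % rows = observations, columns = variables
    coeff = pca(X);                   % default options (data are centered)
    nComps(i) = size(coeff, 2);       % one column of coeff per component
    fprintf('%d rows -> %d components\n', rowCounts(i), nComps(i));
end
```

With my real data, the component count reported this way is always one less than the number of rows.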
Also, I observed that the sum of the output variable ‘explained’ is always 100, whether I have 5 or 50 principal components. Am I right to assume that this variable therefore does not refer to the proportion of the original data’s variance explained by the principal components, but rather reflects how the variance captured by the components is spread across the individual components? How can I find out the former? That is, how much of my data’s variance is included in the resulting principal components? Or do the principal components always reflect the whole variance, no matter how few they might be?
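A short sketch of the check I have in mind, again with random placeholder data instead of my EEG matrix. It prints the sum of ‘explained’ and, as one possible way to relate the components to the original data, compares the sum of the ‘latent’ output (the component variances) with the summed variances of the original columns. The variable name totalVar is my own, and I am not sure this comparison is the right test.

```matlab
% Minimal sketch with placeholder data (not the real EEG matrix).
rng(1);
X = randn(50, 2000);                     % rows = observations, columns = variables

% 3rd output: variance of each component; 5th output: percentage per component
[~, ~, latent, ~, explained] = pca(X);
fprintf('components: %d, sum(explained) = %.4f\n', ...
        numel(explained), sum(explained));

% Total variance of the original data, summed over all columns
totalVar = sum(var(X, 0, 1));
fprintf('sum(latent) = %.4f, summed column variance = %.4f\n', ...
        sum(latent), totalVar);
```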
Finally, my understanding is that the ‘scores’ output reflects my data’s variance, meaning that its columns can be used analogously to the columns (variables) of my original data. Is this right? Or do I have to project my data back onto the original axes after performing PCA and keeping only a subset of the components? If so, how do I then reduce the input dimensions at all? I tried ‘reversing’ PCA and obtained the same number of variables as before, just with different values in the matrix.
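This is roughly what I mean by the two options, sketched with random placeholder data. The number of retained components k = 10 is arbitrary, and the names Z and Xhat are only for illustration. Note that the MATLAB documentation calls the second output ‘score’.

```matlab
% Minimal sketch with placeholder data (not the real EEG matrix).
rng(2);
X = randn(50, 2000);                   % rows = observations, columns = variables
[coeff, score, ~, ~, ~, mu] = pca(X);  % mu holds the column means removed by pca

k = 10;                                % arbitrary number of components to keep
Z = score(:, 1:k);                     % option 1: use the first k score columns

% Option 2: 'reverse' PCA, mapping the k components back to the original axes
Xhat = bsxfun(@plus, Z * coeff(:, 1:k)', mu);

disp(size(Z));                         % observations-by-k
disp(size(Xhat));                      % same size as X
```

Z has only k columns, whereas Xhat has as many columns as X again, which is the behaviour that confuses me.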
I hope these questions are reasonable, and I appreciate any help you can offer. Unfortunately, I was not able to find answers by searching the web.
Best wishes.
Best Answer