MATLAB: How to apply PCA correctly

dimension reductionpcaprincipal component analysisStatistics and Machine Learning Toolbox

Hello
I'm currently struggling with PCA and Matlab. Let's say we have a data matrix X and a response y (classification task). X consists of 12 rows and 4 columns. The rows are the data points, the columns are the predictors (features).
Now, I can do PCA with the following command:
[coeff, score] = pca(X);
As I understood from the matlab documentation, coeff contains the loadings and score contains the principal components in the columns. That mean first column of score contains the first principal component (associated with the highest variance) and the first column of coeff contains the loadings for the first principal component.
Is this correct?
But if this is correct, why is then X * coeff not equal to score?

Best Answer

Maybe this script will help.
rng 'default'
M = 7; % Number of observations
N = 5; % Number of variables observed
X = rand(M,N);
% De-mean
X = bsxfun(@minus,X,mean(X));
% Do the PCA
[coeff,score,latent] = pca(X);
% Calculate eigenvalues and eigenvectors of the covariance matrix
covarianceMatrix = cov(X);
[V,D] = eig(covarianceMatrix);
% "coeff" are the principal component vectors.
% These are the eigenvectors of the covariance matrix.
% Compare the columns of coeff and V.
% (Note that the columns are not necessarily in the same *order*,
% and they might be *lightly different from each other
% due to floating-point error.)
coeff
V
% Multiply the original data by the principal component vectors
% to get the projections of the original data on the
% principal component vector space. This is also the output "score".
% Compare ...
dataInPrincipalComponentSpace = X*coeff
score
% The columns of X*coeff are orthogonal to each other. This is shown with ...
corrcoef(dataInPrincipalComponentSpace)
% The variances of these vectors are the eigenvalues of the covariance matrix, and are also the output "latent". Compare
% these three outputs
var(dataInPrincipalComponentSpace)'
latent
sort(diag(D),'descend')