MATLAB: PCA function outputting scores different from those expected, am I missing something?

pca

I have run into an issue with the pca function whereby the output PC1 scores are the negative of those expected. To confirm this I tried to recreate an example I found online (<http://setosa.io/ev/principal-component-analysis/>). I have attached a MATLAB file for ease of use. What I have done is:
>> [coeff,score,~,~,~] = pca(Example'); % I use the transpose of the data because "Variables1" (17x1) are the variables I want to analyse.*
>> scatter(score(:,1),score(:,2));
>> text(score(:,1)+dx,score(:,2)+dy,Variables2)
*Pretty sure this is worded terribly (sorry)
Above is the output that I am expecting to find, and below is the output that I am getting from the pca function. As you can see, the PC2 values are the same but the PC1 values are the negative of those expected (Fig above).
Why does this happen? (Not necessary, but if you can word this part * better it would be much appreciated for when I have to explain it.)
Thanks in advance.

Best Answer

Hi Matteo,
I don't believe there is really a problem here. Let Vt denote the transpose of the data matrix Values, so that Vt is 4x17 like you want. With
[coef, score, latent] = pca(Vt)
pca computes** the eigenvalue decomposition of the covariance matrix of Vt. The resulting eigenvectors are the columns of coef. Then score is computed with
score = Vt0*coef
where Vt0 is the centered data, shown in the code below. Each eigenvector is real and normalized to unit length, but it is still arbitrary to within an overall factor of +-1: flipping the sign of a column of coef leaves it a perfectly good unit eigenvector, and the columns stay orthogonal to each other, so there is no foolproof way to assign those signs uniquely. It looks like the example you are using disagrees with MATLAB on the overall sign of the first column of coef, so the score matrix comes up with different signs as well. Nothing wrong with that.
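Here is a minimal sketch of that claim (mine, not part of the original answer), using a small random matrix as a stand-in for the real data: pca's coefficient columns match the eigenvectors of the covariance matrix up to exactly that per-column sign.
X = randn(17,4);                      % random stand-in data: 17 observations, 4 variables
[coef, ~, latent] = pca(X);           % principal directions and their variances
[V, D] = eig(cov(X));                 % eigen-decomposition of the covariance matrix
[d, idx] = sort(diag(D), 'descend');  % eig returns ascending order, pca descending
V = V(:, idx);
max(max(abs(abs(coef) - abs(V))))     % ~0: same directions, signs possibly flipped
max(abs(latent - d))                  % ~0: same variances along those directions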
What matters is that you can still relate the scores to the data. No matter what the overall signs of the columns of coef are, after you calculate scores that way it should still be true that
Vt0 = score*coef'
You can change the overall signs of the coef columns and make your own coef, as in the code below. The resulting scores then agree with the online example.
Forget about salad. After looking at the png file, this all makes me want to fly to Belfast and eat fish and chips.
load('Example.mat')
Vt = Values';
[coef, score, lat] = pca(Vt);
% create new coef matrix and a new score matrix
coefnew = coef;
coefnew(:,1) = -coefnew(:,1);
Vt0 = Values' - mean(Values'); % covariance matrix calculation does this anyway
scorenew = Vt0*coefnew;
figure(1);scatter(score(:,1), score(:,2))
figure(2);scatter(scorenew(:,1), scorenew(:,2)) % same as example
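% check the reconstruction identity Vt0 = score*coef' for both sign choices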
Vt0_check = score*coef'
Vt0_check_new = scorenew*coefnew'
max(max(abs(Vt0-Vt0_check)))
max(max(abs(Vt0-Vt0_check_new)))
** It accomplishes this more accurately using svd instead of eig, but with the same intent.
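For completeness, a minimal sketch of that svd route (again mine, on a random stand-in matrix): the right singular vectors of the centered data are the same directions pca returns, up to the same per-column +-1 sign, and the singular values give the same variances.
X  = randn(17,4);                            % random stand-in data matrix
Xc = X - mean(X);                            % center the data, as pca does internally
[~, S, W] = svd(Xc, 'econ');                 % right singular vectors = principal directions
[coef, ~, latent] = pca(X);
max(max(abs(abs(W) - abs(coef))))            % ~0: same directions up to column signs
max(abs(latent - diag(S).^2/(size(X,1)-1)))  % ~0: squared singular values / (n-1) = variances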