Solved – Different results for Singular Value Decomposition (SVD) using different tools

latent-semantic-analysis, matrix, svd

I am currently implementing Latent Semantic Analysis in Java using the EJML library for the preliminary Singular Value Decomposition (SVD). I am testing my code against the original term frequency matrix provided in an oldish introductory paper by Landauer et al.:
http://lsa.colorado.edu/papers/dp1.LSAintro.pdf

Strangely enough, I am getting slightly different results in the U matrix of the SVD compared to the results in the paper. Columns 2, 6, 8 and 9 are the negative of the results in the paper.
Even stranger, I am getting yet another set of results (different from both EJML and the paper) when using GNU Octave and an online tool (http://www.bluebit.gr/matrix-calculator/calculate.aspx). In this latter case, columns 1, 7, 8, and 9 are the negative of the results in the paper.

The exact values are visible here:

Results image http://postimg.org/image/c5kkrk5h5/

(note that U is called W in the paper)

This is the original term frequency matrix (tab delimited):

1   0   0   1   0   0   0   0   0
1   0   1   0   0   0   0   0   0
1   1   0   0   0   0   0   0   0
0   1   1   0   1   0   0   0   0
0   1   1   2   0   0   0   0   0
0   1   0   0   1   0   0   0   0
0   1   0   0   1   0   0   0   0
0   0   1   1   0   0   0   0   0
0   1   0   0   0   0   0   0   1
0   0   0   0   0   1   1   1   0
0   0   0   0   0   0   1   1   1
0   0   0   0   0   0   0   1   1

It would be interesting to compare this to other tools and languages (R, Matlab …) to see what results these would yield. So, if you have time to run the SVD in a different environment, it would be great if you could post the results here.

Would anyone have any idea where these differences might come from?

Thanks a lot.

Cheers,

Martin

Best Answer

The SVD of a matrix is in general not unique. If A = U S V^T is an SVD of A, then for any diagonal matrix M whose diagonal entries are all -1 or +1, (UM) S (VM)^T = U (M S M) V^T = U S V^T is also a valid SVD, because M and S are both diagonal and M^2 = I. In other words, you may flip the sign of any column of U as long as you also flip the corresponding column of V. Different implementations may simply end up with different sign choices. PCA has the same issue: only the axes of the eigenvectors (singular vectors) are unique, not their signs.