[Math] Projection onto Singular Vector Subspace for Singular Value Decomposition

eigenvalues-eigenvectors, linear-algebra

I am not very sophisticated in my linear algebra, so please excuse any messiness in my terminology, etc.

I am attempting to reduce the dimensionality of a dataset using Singular Value Decomposition, and I am having a little trouble figuring out how this should be done. I have found a lot of material about reducing the rank of a matrix, but not reducing its dimensions.

For instance, if I decompose using SVD: $A = USV^T$, I can reduce the rank of the matrix by eliminating singular values below a certain threshold along with their corresponding singular vectors. However, doing this returns a matrix of the same dimensions as $A$, albeit of a lower rank.

What I actually want is to be able to express all of the rows of the matrix in terms of the top principal components (so an original 100×80 matrix becomes a 100×5 matrix, for example). This way, when I calculate distance measures between rows (cosine similarity, Euclidean distance), the distances will be in this reduced dimension space.

My initial take is to multiply the original data by the top $k$ right singular vectors: $AV_k$. Since the columns of $V$ form an orthonormal basis for the row space of $A$, I interpret this as projecting the original data onto the subspace spanned by the first $k$ right singular vectors, which I believe is what I want.
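For what it's worth, this projection is a one-liner in NumPy. The sketch below uses random data and $k=5$ purely as an illustration (the matrix `A`, the shapes, and the variable names are my assumptions, not from the question); it also checks the identity $AV_k = U_k S_k$, which follows from the orthonormality of $V$'s columns:

```python
import numpy as np

# Hypothetical 100x80 data matrix; replace with your own data.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 80))

# Thin SVD: A = U @ diag(s) @ Vt, with singular values s in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 5
V_k = Vt[:k].T          # 80x5: top-k right singular vectors as columns
A_reduced = A @ V_k     # 100x5: each row of A expressed in the k-dim subspace

# Sanity check: A @ V_k equals U_k scaled by the top-k singular values.
assert np.allclose(A_reduced, U[:, :k] * s[:k])
```

Distances between rows (cosine similarity, Euclidean distance) can then be computed on `A_reduced` instead of `A`.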

Am I off base here? Any suggestions on how to approach this problem differently?

Best Answer

If you want to do the 100×80 to 100×5 conversion, you can just multiply $U$ by the reduced $S$ (the one with the low singular values zeroed out). That leaves you with a 100×80 matrix whose last 75 columns are all zeros (provided your singular value threshold kept only 5 values). Dropping those zero columns gives you the 100×5 representation.

The above 100×5 matrix can then be multiplied by the 5×80 matrix obtained by removing the last 75 rows of $V^T$; this yields the rank-5 approximation of $A$ that you are effectively using.
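A short sketch of the two steps in this answer, again on illustrative random data (the shapes and names are assumptions): form $U_k S_k$ as the 100×5 representation, then multiply by the truncated $V_k^T$ to recover the rank-5 approximation of $A$.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 80))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 5
B = U[:, :k] * s[:k]    # 100x5: U times the reduced S, zero columns dropped
A_k = B @ Vt[:k]        # 100x80: the rank-k approximation of A

assert np.linalg.matrix_rank(A_k) == k
```

Note that `B` here equals `A @ Vt[:k].T`, so both routes (projecting $A$ onto $V_k$, or taking $U_k S_k$ directly) produce the same reduced representation.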
