Solved – Interpretation of matrix factorization results

feature selection, matrix decomposition, recommender-system

Matrix factorization methods are known to give good results on problems such as movie recommendation. The method reduces the feature space, and the reduced space is then used for recommendations.

For example, consider a user-item matrix in which each element is a user's rating of a product, with dimensions of, say, 1000 by 20000. We can apply matrix factorization to this matrix with a latent feature size of 10. This results in a user latent feature matrix P of size 1000 by 10 and an item latent feature matrix Q of size 10 by 20000. Each row of P represents the strength of the associations between a user and the features. Similarly, each column of Q represents the strength of the associations between an item and the features.
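To make the shapes concrete, here is a minimal sketch in Python with NumPy. It is only an illustration under several assumptions that are not part of the question: the matrix is scaled down to 100 by 200, the ratings are synthetic and fully observed, and a truncated SVD stands in for the factorization (real recommenders usually fit P and Q on the observed ratings only, for example with SGD or ALS). Since Q is 10 by n_items, an item's latent vector is a column of Q, and a predicted rating is the dot product of a user's row of P with that column.

```python
import numpy as np

rng = np.random.default_rng(0)

# Scaled-down stand-in for the 1000 x 20000 example (assumption for speed).
n_users, n_items, k = 100, 200, 10

# Synthetic, fully observed ratings in 1..5 (assumption: real data is sparse).
R = rng.integers(1, 6, size=(n_users, n_items)).astype(float)

# Truncated SVD as one possible factorization; keep the top k components.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
P = U[:, :k] * np.sqrt(s[:k])            # user factors, shape (n_users, k)
Q = np.sqrt(s[:k])[:, None] * Vt[:k]     # item factors, shape (k, n_items)

R_hat = P @ Q                            # rank-k approximation of R

user_vec = P[3]        # latent vector of user 3: a row of P
item_vec = Q[:, 7]     # latent vector of item 7: a column of Q
pred = user_vec @ item_vec               # approximate rating of item 7 by user 3

print(P.shape, Q.shape)
```

The split of the singular values between P and Q (a square root on each side) is an arbitrary choice made here; any other split gives the same product.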

How do we interpret this reduced latent feature space? What is the relation between the reduced latent feature space and the actual feature space?

Best Answer

Matrix factorization is widely used for its scalability and its ability to handle sparse datasets, both of which come from reducing the feature space to a lower-dimensional latent feature space.

One of its major drawbacks, however, is the lack of interpretability: the factorization does not preserve the input features, and there is no easy transformation that helps interpret the latent features. On the other hand, this also means it may find correlations between features that we would not have thought of and that are more expressive of user preferences.
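One way to see why the latent features carry no fixed meaning is that the factorization is not unique: for any invertible k by k matrix M, the pair (P M, M^-1 Q) gives exactly the same predictions as (P, Q), even though the individual features are mixed together. The sketch below illustrates this with small random matrices; the sizes and the random rotation M are assumptions chosen only for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Small hypothetical factors: 6 users, 8 items, 3 latent features.
P = rng.normal(size=(6, 3))
Q = rng.normal(size=(3, 8))

# A random orthogonal matrix from a QR decomposition; its inverse is M.T.
M, _ = np.linalg.qr(rng.normal(size=(3, 3)))

P2 = P @ M        # mixed user factors
Q2 = M.T @ Q      # compensating item factors

same_predictions = np.allclose(P @ Q, P2 @ Q2)   # products agree
same_factors = np.allclose(P, P2)                # the factors themselves differ

print(same_predictions, same_factors)
```

Both factorizations fit the ratings equally well, so nothing in the data singles out one set of latent axes as the "real" features.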

The closest thing to an interpretation of the latent features that I have come across is an idea called Representative Users, in which the latent features are seen as users that can be used to represent all other users.
