Solved – Does it make sense to multiply two embedding vectors

deep-learning, machine-learning, word-embeddings, word2vec

Many researchers use neural networks to infer embedding vectors for words, users, or items. Word embeddings, e.g., word2vec, allow people to compute sums, averages, and differences over embeddings.

So does it make sense to multiply two embeddings element-wise? For instance, take one 200-d user embedding and one 200-d movie embedding. The multiplication results in a new 200-d vector, which should be able to represent the interaction between the user and the movie. That new vector can then be fed into any prediction model. Does it make sense?
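As a concrete illustration of what is being asked, here is a minimal NumPy sketch. The embeddings, the weight vector `w`, and the sigmoid scorer are all hypothetical placeholders; in practice the embeddings would come from a trained model and the predictor could be any downstream model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 200-d embeddings for one user and one movie
# (in practice these come from a trained embedding model).
user_emb = rng.normal(size=200)
movie_emb = rng.normal(size=200)

# Element-wise (Hadamard) product: a new 200-d interaction vector.
interaction = user_emb * movie_emb

# The interaction vector can be the input of any prediction model,
# e.g. a logistic-regression-style score with hypothetical weights w.
w = rng.normal(size=200)
score = 1.0 / (1.0 + np.exp(-(interaction @ w)))
print(interaction.shape, score)
```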

Best Answer

Yes, it does. Here you can find an example of a network that uses multiplication, among other methods, for combining embeddings. As described in my answer there:

The element-wise product $u * v$ is basically an interaction term; it can capture similarities between values (big * big = bigger; small * small = smaller) or discrepancies (negative * positive = negative) (see example here).
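A tiny numeric example, with made-up values, shows this behaviour of the element-wise product:

```python
import numpy as np

u = np.array([ 2.0,  0.1, -1.5])
v = np.array([ 3.0,  0.2,  2.0])

# big * big = bigger, small * small = smaller, negative * positive = negative
print(u * v)   # [ 6.    0.02 -3.  ]
```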

So it is a perfectly reasonable way of combining embeddings, but often, as in the example above, people use several different combination methods in parallel (see the sketch below) to produce different kinds of features for the model.
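A small sketch of that "several methods in parallel" idea: the choice of methods here (raw vectors, element-wise product, absolute difference) is just one common assumption, not the only option.

```python
import numpy as np

def combine(u: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Combine two embeddings with several methods in parallel:
    the raw vectors, their element-wise product, and their absolute
    difference. The concatenation gives a downstream model several
    different kinds of features to work with."""
    return np.concatenate([u, v, u * v, np.abs(u - v)])

rng = np.random.default_rng(0)
u, v = rng.normal(size=200), rng.normal(size=200)
features = combine(u, v)
print(features.shape)   # (800,) -> input to any prediction model
```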