Solved – Is inverse t-SNE feasible?

dimensionality-reduction, tsne, word2vec

Short question: is it meaningful to use t-SNE (http://homepage.tudelft.nl/19j49/t-SNE.html) to modify existing high-dimensional data using similarities in some low-dimensional vectors?
In essence, that means applying t-SNE in the reverse direction (but with the constraint that we are already given the high-dimensional vectors and just want to "deform" them using the information in the low-dimensional vectors).

Long question:
Here is my task: I have a list of word embeddings (in the word2vec sense), and these vectors are all 64-dimensional.

On the other hand, for each word I have another vector (15-dimensional), and the mapping is one-to-one (i.e. each word has a 64-dimensional embedding and a 15-dimensional signature). The 15-dimensional vectors contain important additional information about the words, which is impossible to take into account directly.

I want to make these 64-dimensional vectors more similar to each other by taking into account the similarity between the corresponding 15-dimensional vectors.

Any advice in this direction is highly appreciated.
Thanks in advance!

Best Answer

You could append the 15-dimensional vector to the 64-dimensional vector, obtaining a 79-dimensional vector. You can then reduce the dimensionality by projecting onto eigensubspaces (i.e. PCA).
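For concreteness, here is a minimal NumPy sketch of that concatenate-then-project idea. The array names `emb64` and `sig15`, and the random stand-in data, are hypothetical placeholders for your actual embeddings and signatures:

```python
import numpy as np

rng = np.random.default_rng(0)
emb64 = rng.standard_normal((5000, 64))  # stand-in for the word2vec embeddings
sig15 = rng.standard_normal((5000, 15))  # stand-in for the 15-d signatures

joint = np.hstack([emb64, sig15])        # 79-dimensional joint representation
joint -= joint.mean(axis=0)              # center before the eigendecomposition

# Project onto the top-64 eigensubspace of the covariance matrix (PCA).
cov = np.cov(joint, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
top64 = eigvecs[:, -64:]                 # leading 64 eigenvectors
deformed = joint @ top64                 # new 64-d vectors, now influenced
                                         # by the 15-d signatures
```

Rescaling the 15-dimensional block before concatenation is one knob for controlling how strongly the signatures influence the result.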

In general, however, there are infinitely many ways to do this, so you need to come up with some sort of objective criterion for evaluating the quality of your embedding.
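As one example of such a criterion (my suggestion, not something fixed by the problem), you could measure how well pairwise distances in the deformed 64-d space agree with pairwise distances in the 15-d signature space:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def embedding_agreement(new64: np.ndarray, sig15: np.ndarray) -> float:
    """Spearman correlation between pairwise distances in the deformed
    64-d space and in the 15-d signature space; higher = better agreement.
    Note: pdist is O(n^2) in the number of words, so subsample if needed."""
    return spearmanr(pdist(new64), pdist(sig15)).correlation
```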

word2vec attempts to maximize the predictive ability of the embedding in skip-grams. What you could do is re-train word2vec, but use the 15-dimensional vectors as side information in the network that predicts the missing word.
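Standard word2vec implementations do not support side information out of the box, so this requires a custom model. Below is a minimal PyTorch sketch of the idea, not the original word2vec code: a skip-gram-style predictor in which a trainable 64-d embedding is concatenated with the fixed 15-d signature before predicting the context word. The class name, variable names, and random toy data are all hypothetical:

```python
import torch
import torch.nn as nn

class SideInfoSkipGram(nn.Module):
    """Skip-gram-style model: trainable 64-d embedding + fixed 15-d
    signature -> logits over the vocabulary for the context word."""
    def __init__(self, vocab_size, signatures, embed_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)  # trainable part
        self.register_buffer("sig", signatures)           # fixed 15-d part
        self.out = nn.Linear(embed_dim + signatures.shape[1], vocab_size)

    def forward(self, center_ids):
        x = torch.cat([self.embed(center_ids), self.sig[center_ids]], dim=-1)
        return self.out(x)

# Toy usage with random data standing in for real (center, context) pairs.
vocab_size = 1000
signatures = torch.randn(vocab_size, 15)
model = SideInfoSkipGram(vocab_size, signatures)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

center = torch.randint(0, vocab_size, (256,))   # center-word ids
context = torch.randint(0, vocab_size, (256,))  # context-word ids
for _ in range(10):
    opt.zero_grad()
    loss_fn(model(center), context).backward()
    opt.step()

# After training, model.embed.weight holds 64-d vectors that were learned
# jointly with the fixed 15-d signatures.
```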
