The TensorFlow LSTM example uses a word embedding layer (https://www.tensorflow.org/versions/r0.11/tutorials/recurrent/index.html). However, it mentions that 'the embedding matrix will be initialized randomly and the model will learn to differentiate the meaning of words just by looking at the data'. I take this to mean that pretrained embeddings (e.g. from Word2Vec) are not used. I have two questions with regard to this.
-
Are there any variables/weights (e.g. adjusted during backprop) associated with the word embedding layer in this RNN? Or is it purely a static lookup table of randomly assigned values?
-
Does a randomly initialized word embedding layer have any advantage over just adding a regular hidden layer with the same dimensions?
Thanks
Best Answer
Correct, pretrained embeddings are not used in that example.
First question: yes, there are variables/weights (adjusted during backprop) associated with the word embedding layer in this RNN. The embedding matrix is a trainable parameter like any other weight matrix, not a static lookup table of fixed random values.
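A toy NumPy sketch (not the tutorial's actual code) of why this is so: a lookup selects one row of the embedding matrix, so the gradient of the loss flows back into exactly that row, and backprop updates it. The loss used here is a made-up example, 0.5 * ||vec||^2, chosen only so the gradient is easy to write down.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim = 5, 3
# The embedding matrix starts out random, just like in the tutorial.
embedding = rng.normal(size=(vocab_size, embed_dim))

word_id = 2
vec = embedding[word_id].copy()  # forward pass: embedding lookup = row selection

# Illustrative loss: 0.5 * ||vec||^2, so dLoss/dvec = vec.
grad_vec = vec
lr = 0.1
# Backprop / SGD step: only the looked-up row of the embedding matrix changes.
embedding[word_id] -= lr * grad_vec
```

After this step the row for `word_id` has moved, while every other row is untouched. That is all "learning the embeddings" means here: the rows are ordinary weights updated by gradient descent.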
Second question: it is equivalent. Adding a randomly initialized word embedding layer amounts to adding a regular hidden layer, specifically a bias-free linear layer (no activation) applied to one-hot word vectors; the lookup is just the efficient implementation.
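A minimal sketch of that equivalence (toy sizes, not the tutorial's code): looking up row `i` of the embedding matrix gives the same result as multiplying a one-hot input vector by that matrix, i.e. a dense layer with no bias and no nonlinearity.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab_size, embed_dim = 6, 4
# W serves as both the embedding matrix and the dense layer's weights.
W = rng.normal(size=(vocab_size, embed_dim))

word_id = 3
one_hot = np.zeros(vocab_size)
one_hot[word_id] = 1.0

lookup = W[word_id]   # embedding-style lookup
dense = one_hot @ W   # hidden-layer-style matmul (no bias, no activation)

assert np.allclose(lookup, dense)
```

The advantage of the lookup formulation is purely computational: it avoids materializing huge one-hot vectors and multiplying by mostly-zero inputs.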