The TensorFlow LSTM example uses a word embedding layer (https://www.tensorflow.org/versions/r0.11/tutorials/recurrent/index.html). However, it mentions that 'the embedding matrix will be initialized randomly and the model will learn to differentiate the meaning of words just by looking at the data'. I take this to mean that pretrained embeddings (e.g. from Word2Vec) are not used. I have two questions with regard to this.
-
Are there any variables/weights (e.g. adjusted during backprop) associated with the word embedding layer in this RNN? Or is it purely a static lookup table of randomly assigned values?
-
Does a randomly initialized word embedding layer have any advantage over just adding a regular hidden layer with the same dimensions?
Thanks
Best Answer
Correct, pretrained embeddings are not used in that example.
First question: yes, there are variables/weights (adjusted during backprop) associated with the word embedding layer in this RNN. The embedding matrix is a trainable parameter like any other weight matrix, not a static lookup table of fixed random values.
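A toy NumPy sketch (not the tutorial's actual code) of why this is so: a lookup selects one row of the embedding matrix, so the gradient of the loss flows back into exactly that row, and backprop updates it. The loss used here is a made-up example, 0.5 * ||vec||^2, chosen only so the gradient is easy to write down.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim = 5, 3
# The embedding matrix starts out random, just like in the tutorial.
embedding = rng.normal(size=(vocab_size, embed_dim))

word_id = 2
vec = embedding[word_id].copy()  # forward pass: embedding lookup = row selection

# Illustrative loss: 0.5 * ||vec||^2, so dLoss/dvec = vec.
grad_vec = vec
lr = 0.1
# Backprop / SGD step: only the looked-up row of the embedding matrix changes.
embedding[word_id] -= lr * grad_vec
```

After this step the row for `word_id` has moved, while every other row is untouched. That is all "learning the embeddings" means here: the rows are ordinary weights updated by gradient descent.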
Second question: it is equivalent. Adding a randomly initialized word embedding layer amounts to adding a regular hidden layer, specifically a bias-free linear layer (no activation) applied to one-hot word vectors; the lookup is just the efficient implementation.
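A minimal sketch of that equivalence (toy sizes, not the tutorial's code): looking up row `i` of the embedding matrix gives the same result as multiplying a one-hot input vector by that matrix, i.e. a dense layer with no bias and no nonlinearity.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab_size, embed_dim = 6, 4
# W serves as both the embedding matrix and the dense layer's weights.
W = rng.normal(size=(vocab_size, embed_dim))

word_id = 3
one_hot = np.zeros(vocab_size)
one_hot[word_id] = 1.0

lookup = W[word_id]   # embedding-style lookup
dense = one_hot @ W   # hidden-layer-style matmul (no bias, no activation)

assert np.allclose(lookup, dense)
```

The advantage of the lookup formulation is purely computational: it avoids materializing huge one-hot vectors and multiplying by mostly-zero inputs.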