Solved – Structure of Recurrent Neural Network (LSTM, GRU)

lstm, neural-networks

I am trying to understand the architecture of RNNs. I found this tutorial, which has been very helpful: http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Especially this image: [the diagram from that post of the repeating LSTM module, labelled A, unrolled across time steps]

How does this fit into a feed-forward network? Is this image just another node in each layer?

Best Answer

A is, in fact, a full layer. The output of the layer is $h_t$, which is the neuron output; it can be fed into a softmax layer (if you want a classification for time step $t$, for instance) or into anything else, such as another LSTM layer if you want to go deeper. The input of this layer is what sets it apart from a regular feed-forward network: it takes both the input $x_t$ and the full state of the network from the previous time step (both $h_{t-1}$ and the other variables of the LSTM cell, i.e. the cell state $c_{t-1}$).
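
To make this concrete, here is a minimal NumPy sketch of one time step of the cell A, following the gate equations in the linked post. The weight names (`W_f`, `b_f`, etc.) and the function name `lstm_step` are my own placeholders, not from the tutorial:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step.

    x_t    : input at time t, shape (input_size,)
    h_prev : hidden state h_{t-1}, shape (hidden_size,)
    c_prev : cell state c_{t-1}, shape (hidden_size,)
    params : dict of weight matrices W_* with shape
             (hidden_size, input_size + hidden_size) and biases b_*
             (hypothetical names, for this sketch only)
    """
    # The cell sees both the current input and the previous hidden state.
    z = np.concatenate([h_prev, x_t])

    f = sigmoid(params["W_f"] @ z + params["b_f"])   # forget gate
    i = sigmoid(params["W_i"] @ z + params["b_i"])   # input gate
    g = np.tanh(params["W_g"] @ z + params["b_g"])   # candidate cell state
    o = sigmoid(params["W_o"] @ z + params["b_o"])   # output gate

    c_t = f * c_prev + i * g     # new cell state
    h_t = o * np.tanh(c_t)       # new hidden state: the layer's output
    return h_t, c_t
```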

Note that $h_t$ is a vector. So, if you want to make an analogy with a regular feed-forward network with one hidden layer, A can be thought of as taking the place of all of the neurons in that hidden layer (plus the extra complexity of the recurrent part).
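
As a rough illustration of that analogy (again only a sketch, reusing the hypothetical `lstm_step` above): the same cell A is applied at every time step with shared weights, and its final $h_t$ can be fed into a softmax readout just like the hidden layer of a feed-forward network would be:

```python
rng = np.random.default_rng(0)
input_size, hidden_size, n_classes = 4, 8, 3

# Random placeholder parameters for the lstm_step sketch above.
params = {}
for gate in ("f", "i", "g", "o"):
    params[f"W_{gate}"] = rng.normal(0.0, 0.1, (hidden_size, input_size + hidden_size))
    params[f"b_{gate}"] = np.zeros(hidden_size)

W_out = rng.normal(0.0, 0.1, (n_classes, hidden_size))  # softmax readout weights

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Unroll the same cell over a toy 5-step sequence: the weights are shared
# across time steps, so A stands in for the whole hidden layer at each step.
h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
for x_t in rng.normal(size=(5, input_size)):
    h, c = lstm_step(x_t, h, c, params)

print(softmax(W_out @ h))  # class probabilities from the final h_t
```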
