Solved – Why use an LSTM (or RNNs in general) instead of a plain NN

deep-learning, lstm, machine-learning, neural-networks, recurrent-neural-network

I understand that an LSTM has three gates that help it keep a memory. But why do we use an LSTM instead of a plain NN in the first place?
For example, in my LSTM the features are 1-D and the number of time steps is set to 60, so the longest memory the LSTM has is 60 days: it cannot use any information from 73 days ago to predict today's target.

But with a plain NN, we can restructure the 1-D variable into a 60-D input whose dimensions are the variable's values 1 day ago, 2 days ago, …, 60 days ago, and use that to predict today's target variable (see the sketch below).
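As an illustration of that windowing idea, here is a minimal sketch, assuming only NumPy and a hypothetical 1-D `series`, that turns the series into a 60-column lag matrix a plain feed-forward NN could consume:

```python
import numpy as np

def make_lag_matrix(series, window=60):
    """Build rows [x[t-60], ..., x[t-1]] with target x[t]."""
    X = np.stack([series[t - window:t] for t in range(window, len(series))])
    y = series[window:]
    return X, y

# Hypothetical example: 500 days of a single variable.
series = np.random.randn(500)
X, y = make_lag_matrix(series, window=60)
print(X.shape, y.shape)  # (440, 60) (440,)
```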

What is the difference here? Either way, we are just using 60 days of information to predict a single variable.

Best Answer

A couple of reasons:

First, you might not know ahead of time how long your sequence will be. RNNs let you use a variable-length input (sketched below).
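A minimal sketch of the variable-length point, assuming Keras (the layer sizes here are hypothetical): the time dimension is declared as `None`, so the same model accepts sequences of any length without being retrained or reshaped.

```python
import numpy as np
from tensorflow import keras

# Time dimension left as None: sequence length can differ from call to call.
model = keras.Sequential([
    keras.Input(shape=(None, 1)),   # (time steps, features), variable time steps
    keras.layers.LSTM(32),
    keras.layers.Dense(1),
])

short_seq = np.random.randn(1, 45, 1)  # 45 days of history
long_seq = np.random.randn(1, 90, 1)   # 90 days of history
print(model.predict(short_seq).shape)  # (1, 1)
print(model.predict(long_seq).shape)   # (1, 1)
```

A fixed 60-D plain NN, by contrast, would need padding, truncation, or a separate model whenever the available history is not exactly 60 days.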

Second, RNNs are a good way to share parameters across time steps. Instead of having to learn how to extract useful information from each time step separately, you reuse the same parameter matrices recurrently.
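One way to see the parameter-sharing point is to count weights. A rough sketch, assuming a hypothetical hidden size of 32 and a single input feature:

```python
# Plain NN on a 60-lag window: every lag position gets its own weights,
# and the count grows with the window length.
window, hidden, features = 60, 32, 1
dense_params = window * features * hidden + hidden            # 60*32 + 32 = 1,952

# LSTM: one set of gate matrices (4 gates) reused at every time step,
# so the count is independent of how many steps you unroll.
lstm_params = 4 * (hidden * (features + hidden) + hidden)     # 4 * (32*33 + 32) = 4,352

print(dense_params, lstm_params)
```

The dense layer's weights are tied to fixed positions ("the value exactly 17 days ago"), whereas the LSTM applies the same weights at every step, so a pattern learned at one offset transfers to any other offset.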