Solved – Selecting number of time lags for input in LSTM networks

lstm · recurrent neural network · time series

I know from theory that LSTMs are meant to selectively capture long- and short-term dependencies in a sequence. I'm trying to implement an LSTM for a time-series task, and I notice that many tutorials on the web use the target sequence lagged by one step as the input, without including observations further back in time (at each time step).

What I don't understand is whether the distinctive properties of LSTMs described above are exploited properly this way. Can an LSTM capture long-term dependencies if it is only fed the most recent observation? Is this handled automatically through the LSTM's internal state, or do I need to feed the network a window of past time lags within which I want it to capture long- and short-term dependencies?

Best Answer

As you mention, RNNs and LSTMs are meant to capture time dependencies in time-series data, so feeding in an input with only one time step does not make much sense (unless you are using a stateful LSTM, which is a different story).
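To make the distinction concrete, here is a minimal Keras sketch (layer sizes are illustrative, not a recommendation): with one time step per sample, a stateless LSTM has no temporal context within a sample; `stateful=True` is the exception, at the cost of a fixed batch size and time-ordered batches.

```python
import tensorflow as tf

# Stateless (the default): the internal state is reset after every sample,
# so with timesteps=1 the layer sees no temporal context at all.
stateless = tf.keras.layers.LSTM(32, input_shape=(1, 1))

# Stateful: the state is carried across batches, but the batch size must be
# fixed and consecutive batches must follow each other in time.
stateful = tf.keras.layers.LSTM(32, stateful=True, batch_input_shape=(1, 1, 1))
```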

Here is an example: we have a product and want to forecast its sales from historical data. We then choose the number of time steps on which to base each prediction; for instance, given 7 days of sales, predict the sales of the 8th day. The input would then have shape (N, ts, 1) and the output shape (N, 1), where N is the total number of samples (so each sample has 7 days of sales as input and the next day's sales as output).
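A minimal sketch of that setup in Keras, assuming a made-up daily-sales series (any 1-D float array works, and the layer sizes are illustrative, not tuned):

```python
import numpy as np
import tensorflow as tf

# A hypothetical year of daily sales, standing in for real historical data.
sales = np.random.rand(365).astype("float32")

ts = 7  # window length: 7 days of sales per sample

# Slide a window over the series to build (N, ts, 1) inputs and (N, 1) targets.
X = np.array([sales[i : i + ts] for i in range(len(sales) - ts)])
y = sales[ts:]
X = X[..., np.newaxis]   # shape (N, 7, 1): N samples, 7 time steps, 1 feature
y = y[:, np.newaxis]     # shape (N, 1): the next day's sales

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(ts, 1)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, verbose=0)
```

Each row of `X` is one 7-day window, so the LSTM can learn dependencies within that window rather than being shown a single lagged value.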

I am not sure which tutorials you are referring to, but this one might have a better example.