I have been reading several papers, articles, and blog posts about RNNs (LSTMs specifically) and how they can be used for time series prediction. In almost all of the examples and code I have found, the problem is defined as finding the next x values of a time series based on previous data. What I am trying to solve is the following:
- Assuming we have t values of a time series, what would its value be at time t+1?
So using different LSTM packages (deeplearning4j, keras, …) that are out there, here is what I am doing right now:
- Create an LSTM network and fit it to t samples. My network has one input and one output, so as input I will have the following patterns, which I call the train data:
t_1, t_2
t_2, t_3
t_3, t_4
- The next step is to use, for example, t_4 as input and expect t_5 as output; then use t_5 as input and expect t_6 as output, and so on.
- When done with a prediction, I use t_5, t_6 to update my model (a minimal sketch of this whole loop is given just after this list).
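To make the steps above concrete, here is a minimal sketch of that loop, assuming the Keras API; the sine-wave series, layer size, and epoch counts are all illustrative placeholders, not part of the original question:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Toy stand-in for the real series (illustrative assumption).
series = np.sin(np.linspace(0, 10, 100)).astype("float32")

# One-input / one-output pairs, exactly as listed above:
# (t_1, t_2), (t_2, t_3), (t_3, t_4), ...
X = series[:-1].reshape(-1, 1, 1)  # (samples, timesteps=1, features=1)
y = series[1:].reshape(-1, 1)

model = Sequential([
    LSTM(16, input_shape=(1, 1)),  # hypothetical layer size
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=1, verbose=0)

# Predict t+1 from the latest observed value ...
last = series[-1:].reshape(1, 1, 1)
next_pred = model.predict(last, verbose=0)

# ... and once the true value arrives, update the model with the new pair.
true_next = np.array([[0.42]], dtype="float32")  # placeholder observation
model.fit(last, true_next, epochs=1, batch_size=1, verbose=0)
```

Note that each input sample here has only a single timestep, which is exactly the point the answer below addresses.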
My question: Is this the correct way of doing it? If yes, then I have no idea what batch_size means or why it is useful.
Note: An alternative that comes to my mind is something similar to the examples which generate a sequence of characters, one character at a time. In that case, the input would be a series of numbers, and I would expect the next series of the same size as output; the value I'm looking for would be the last number in that series. I am not sure which of the above-mentioned approaches is correct and would really appreciate any help in this regard.
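For reference, a sketch of that alternative under the same assumptions as above: train on fixed-size windows where the target is the input window shifted one step ahead, then read off the last element of the predicted window. The window size and layer size are hypothetical.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed

series = np.sin(np.linspace(0, 10, 100)).astype("float32")
window = 10  # assumed window size

# Each target window is the input window shifted one step ahead.
n = len(series) - window
X = np.array([series[i : i + window] for i in range(n)]).reshape(-1, window, 1)
Y = np.array([series[i + 1 : i + window + 1] for i in range(n)]).reshape(-1, window, 1)

model = Sequential([
    LSTM(16, return_sequences=True, input_shape=(window, 1)),
    TimeDistributed(Dense(1)),  # one output per timestep
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, Y, epochs=10, verbose=0)

# The value of interest is the last element of the predicted window.
pred = model.predict(series[-window:].reshape(1, window, 1), verbose=0)
next_value = pred[0, -1, 0]
```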
Best Answer
An RNN's input should also have a time dimension, or sequence length. In your example, the sequence length is only 1 for each training sample, so each training step covers only a single time step. That means essentially no long-term memory can be built by this network, and you are not using the defining feature of an RNN (especially an LSTM).
To do it correctly, you need to increase the sequence length of each training sample (e.g., t_1, t_2, t_3, t_4) so that the pattern can be seen within a single input. You will need to decide how to cut the entire time series into pieces of a fixed length.
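As a sketch of that fixed-length windowing (the sequence length, layer size, and sine-wave data are illustrative assumptions):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

series = np.sin(np.linspace(0, 10, 200)).astype("float32")
seq_len = 4  # e.g. use t_1..t_4 to predict t_5

# Cut the series into overlapping fixed-length pieces.
X = np.array([series[i : i + seq_len] for i in range(len(series) - seq_len)])
y = series[seq_len:]
X = X.reshape(-1, seq_len, 1)  # (samples, timesteps, features)

model = Sequential([
    LSTM(32, input_shape=(seq_len, 1)),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=20, verbose=0)
```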
If instead you want the network to learn from a sequence of indefinite length, like the example you gave, you will need additional settings to make it work. In Keras, you should use the stateful mode with batch_size = 1 so that the network can carry its state over from one sample to the next. shuffle needs to be turned off so that the temporal order is preserved during training. Finally, you need to manually reset the state of the network with model.reset_states() after each epoch of training (since in this case one epoch is one pass through the time series). See Jason Brownlee's blog for a detailed example.
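A minimal sketch of that stateful setup (hyperparameters and data are again placeholders):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

series = np.sin(np.linspace(0, 10, 100)).astype("float32")
X = series[:-1].reshape(-1, 1, 1)  # one timestep per sample, as in the question
y = series[1:].reshape(-1, 1)

model = Sequential([
    # A stateful layer needs a fixed batch size, hence batch_input_shape.
    LSTM(16, stateful=True, batch_input_shape=(1, 1, 1)),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")

for epoch in range(10):
    # shuffle=False preserves the temporal order within each epoch.
    model.fit(X, y, epochs=1, batch_size=1, shuffle=False, verbose=0)
    # One epoch = one pass through the series, so reset the state here.
    model.reset_states()
```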