I have been reading several papers, articles, and blog posts about RNNs (LSTMs specifically) and how they can be used for time series prediction. In almost all of the examples and code I have found, the problem is defined as finding the next x values of a time series based on previous data. What I am trying to solve is the following:
- Assuming we have t values of a time series, what would its value be at time t+1?
So using different LSTM packages (deeplearning4j, keras, …) that are out there, here is what I am doing right now:
- Create an LSTM network and fit it to t samples. My network has one input and one output, so as input I will have the following patterns, which I call the train data:
t_1, t_2
t_2, t_3
t_3, t_4
- The next step is to use, for example, t_4 as input and expect t_5 as output; then use t_5 as input and expect t_6 as output, and so on.
- When done with a prediction, I use t_5, t_6 to update my model (a minimal sketch of this whole loop is given just after this list).
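To make the steps above concrete, here is a minimal sketch of that loop, assuming the Keras API; the sine-wave series, layer size, and epoch counts are all illustrative placeholders, not part of the original question:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Toy stand-in for the real series (illustrative assumption).
series = np.sin(np.linspace(0, 10, 100)).astype("float32")

# One-input / one-output pairs, exactly as listed above:
# (t_1, t_2), (t_2, t_3), (t_3, t_4), ...
X = series[:-1].reshape(-1, 1, 1)  # (samples, timesteps=1, features=1)
y = series[1:].reshape(-1, 1)

model = Sequential([
    LSTM(16, input_shape=(1, 1)),  # hypothetical layer size
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=1, verbose=0)

# Predict t+1 from the latest observed value ...
last = series[-1:].reshape(1, 1, 1)
next_pred = model.predict(last, verbose=0)

# ... and once the true value arrives, update the model with the new pair.
true_next = np.array([[0.42]], dtype="float32")  # placeholder observation
model.fit(last, true_next, epochs=1, batch_size=1, verbose=0)
```

Note that each input sample here has only a single timestep, which is exactly the point the answer below addresses.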
My question: Is this the correct way of doing it? If yes, then I have no idea what batch_size means or why it is useful.
Note: An alternative that comes to my mind is something similar to the examples which generate a sequence of characters, one character at a time. In that case, the input would be a series of numbers, and I would expect the next series of the same size as output; the value I'm looking for would be the last number in that series. I am not sure which of the above-mentioned approaches is correct and would really appreciate any help in this regard.
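For reference, a sketch of that alternative under the same assumptions as above: train on fixed-size windows where the target is the input window shifted one step ahead, then read off the last element of the predicted window. The window size and layer size are hypothetical.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed

series = np.sin(np.linspace(0, 10, 100)).astype("float32")
window = 10  # assumed window size

# Each target window is the input window shifted one step ahead.
n = len(series) - window
X = np.array([series[i : i + window] for i in range(n)]).reshape(-1, window, 1)
Y = np.array([series[i + 1 : i + window + 1] for i in range(n)]).reshape(-1, window, 1)

model = Sequential([
    LSTM(16, return_sequences=True, input_shape=(window, 1)),
    TimeDistributed(Dense(1)),  # one output per timestep
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, Y, epochs=10, verbose=0)

# The value of interest is the last element of the predicted window.
pred = model.predict(series[-window:].reshape(1, window, 1), verbose=0)
next_value = pred[0, -1, 0]
```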
Best Answer
An RNN's input should also have a time dimension, or sequence length. In your example, the sequence length is only 1 for each training sample, so each training step covers only a single time step. That means essentially no long-term memory can be built by this network, and you are not using the defining feature of an RNN (especially an LSTM).
To do it correctly, you need to increase the sequence length of each training sample (e.g., t_1, t_2, t_3, t_4) so that the pattern can be seen within a single input. You will need to decide how to cut the entire time series into pieces of a fixed length.
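As a sketch of that fixed-length windowing (the sequence length, layer size, and sine-wave data are illustrative assumptions):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

series = np.sin(np.linspace(0, 10, 200)).astype("float32")
seq_len = 4  # e.g. use t_1..t_4 to predict t_5

# Cut the series into overlapping fixed-length pieces.
X = np.array([series[i : i + seq_len] for i in range(len(series) - seq_len)])
y = series[seq_len:]
X = X.reshape(-1, seq_len, 1)  # (samples, timesteps, features)

model = Sequential([
    LSTM(32, input_shape=(seq_len, 1)),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=20, verbose=0)
```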
If instead you want the network to learn from a sequence of indefinite length, like the example you gave, you will need additional settings to make it work. In Keras, you should use the stateful mode with batch_size = 1 so that the network can carry its state over from one sample to the next. shuffle needs to be turned off so that the temporal order is preserved during training. Finally, you need to manually reset the state of the network with model.reset_states() after each epoch of training (since in this case one epoch is one pass through the time series). See Jason Brownlee's blog for a detailed example.
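A minimal sketch of that stateful setup (hyperparameters and data are again placeholders):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

series = np.sin(np.linspace(0, 10, 100)).astype("float32")
X = series[:-1].reshape(-1, 1, 1)  # one timestep per sample, as in the question
y = series[1:].reshape(-1, 1)

model = Sequential([
    # A stateful layer needs a fixed batch size, hence batch_input_shape.
    LSTM(16, stateful=True, batch_input_shape=(1, 1, 1)),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")

for epoch in range(10):
    # shuffle=False preserves the temporal order within each epoch.
    model.fit(X, y, epochs=1, batch_size=1, shuffle=False, verbose=0)
    # One epoch = one pass through the series, so reset the state here.
    model.reset_states()
```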