Classification – How to Feed an LSTM Network with a Mini Batch and When to Reset LSTM State

Tags: classification, lstm, neural networks, recurrent neural network

My classification problem is the following: I have a sequence of features, which is used to predict one of 200 classes. I'm trying to use RNNs (more specifically, LSTMs).

In each learning iteration, my framework processes a mini-batch (B feature sequences, each of length N). The elements of the sequences are fed into the network one time step at a time, resulting in N loop iterations, with B features (one per sequence) fed at each iteration.

The actual question is about the basic learning process in LSTMs: when should I reset the state of the LSTM? Do I have to do it at every iteration, i.e. for each mini-batch? Or is the reset done only once, before the actual training?

My first thoughts about this are the following: if I reset the state at each learning iteration, then the LSTM calculates the new state based on the B feature sequences, which are not necessarily from the same class. Would it be better for the training to have samples (feature sequences) from the same class in one mini-batch?

EDIT: After some discussion with my colleagues and some investigation of the framework I am using (Chainer), I have found out a few things. First, as you said, the state should be reset for every mini-batch. Other frameworks, like TensorFlow, do this reset automatically before each pass through the net. The second part of my question was actually about whether the state in an LSTM is shared across all samples in the mini-batch. The answer is no: in Chainer, the LSTM keeps B states, one for each sequence. Finally, after rethinking my question, the answers are actually quite obvious, but when you are new to something, everything is unclear.
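
To make the two points from the EDIT concrete, here is a minimal NumPy sketch of a forward-only LSTM. It is not Chainer's implementation: the class `TinyLSTM`, the helper `run_minibatch`, the sizes, and the random weights are all made up for illustration. It assumes a mini-batch laid out as an array of shape (B, N, F), with F features per time step.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyLSTM:
    """Hypothetical minimal LSTM layer (forward pass only)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        # One weight matrix covering all four gates (input, forget, candidate, output).
        self.W = rng.normal(0.0, 0.1, size=(n_in + n_hidden, 4 * n_hidden))
        self.b = np.zeros(4 * n_hidden)
        self.n_hidden = n_hidden
        self.h = None  # hidden state, shape (B, n_hidden) once initialised
        self.c = None  # cell state, shape (B, n_hidden) once initialised

    def reset_state(self):
        # Forget everything carried over from the previous mini-batch.
        self.h = None
        self.c = None

    def step(self, x):
        # x has shape (B, n_in): one time step for each of the B sequences.
        if self.h is None:
            self.h = np.zeros((x.shape[0], self.n_hidden))
            self.c = np.zeros((x.shape[0], self.n_hidden))
        z = np.concatenate([x, self.h], axis=1) @ self.W + self.b
        i, f, g, o = np.split(z, 4, axis=1)
        # Every operation is row-wise, so row k of the state depends only on sequence k.
        self.c = sigmoid(f) * self.c + sigmoid(i) * np.tanh(g)
        self.h = sigmoid(o) * np.tanh(self.c)
        return self.h


def run_minibatch(lstm, batch):
    # batch has shape (B, N, n_in). Reset once per mini-batch, then loop over N steps.
    lstm.reset_state()
    for t in range(batch.shape[1]):
        h = lstm.step(batch[:, t, :])
    return h  # final hidden state, one row per sequence


rng = np.random.default_rng(1)
B, N, F, H = 4, 5, 3, 8
lstm = TinyLSTM(F, H)
batch = rng.normal(size=(B, N, F))

h_batch = run_minibatch(lstm, batch)       # all B sequences together
h_single = run_minibatch(lstm, batch[:1])  # first sequence on its own

print(h_batch.shape)
print(np.allclose(h_batch[0], h_single[0]))
```

The final check compares the state of the first sequence when it is processed inside the batch with its state when it is processed alone. Because the state is kept per row, the two should agree, which is why mixing classes within one mini-batch does not contaminate the state of any individual sequence.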

Best Answer

When should I reset the state of the LSTMs?

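Typically, for each new input, i.e. for each sample (one complete sequence). When B sequences are processed in parallel as a mini-batch, each sequence has its own state, so this amounts to one reset at the start of every mini-batch.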

How to feed the network with a mini-batch?

Typically, samples are padded so that all samples in a mini batch have the same length, for programming and performance reasons.
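
As a rough illustration of the padding step, here is a small NumPy sketch. The function name `pad_batch`, the zero padding value, and the boolean mask are assumptions added for this example and are not part of the answer above; the mask is only one common way to tell later code which time steps are real data and which are padding.

```python
import numpy as np


def pad_batch(sequences, pad_value=0.0):
    """Pad sequences of shape (length_k, F) to a common length.

    Returns the padded array of shape (B, N_max, F) and a boolean mask of
    shape (B, N_max) that is True where a time step holds real data.
    """
    n_max = max(len(s) for s in sequences)
    n_feat = sequences[0].shape[1]
    padded = np.full((len(sequences), n_max, n_feat), pad_value, dtype=float)
    mask = np.zeros((len(sequences), n_max), dtype=bool)
    for k, s in enumerate(sequences):
        padded[k, :len(s)] = s   # copy the real time steps
        mask[k, :len(s)] = True  # mark them as valid
    return padded, mask


# Three sequences of lengths 2, 4 and 1, each with 3 features per step.
seqs = [np.ones((2, 3)), np.ones((4, 3)), np.ones((1, 3))]
padded, mask = pad_batch(seqs)

print(padded.shape)      # (3, 4, 3)
print(mask.sum(axis=1))  # [2 4 1]
```

With all samples at the same length, the whole mini-batch is a single array that can be sliced one time step at a time. How the padded steps are then ignored (for example in the loss) depends on the framework.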