Solved – Training an LSTM on a sequence one item at a time

classification, lstm, machine-learning, neural-networks

I am trying to train an LSTM on a sequence and get a single classification for the whole sequence.

I have sequences of varying length, so I use one input neuron and feed one item at a time. Isn't that the proper approach?

My issue is that I am training each of these inputs toward a single ideal output, but some of the same items also appear in other sequences that have different ideal outputs.

So when I train 0.74 with 1,0, 0.83 with 1,0, and 0.32 with 1,0, each of those items is trained toward the class 1,0. But when I then train the sequence 0.74 0.83 0.32 with 0,1, the training diverges to infinity, because I have assigned two different classes to the same inputs.

How am I supposed to train an LSTM on sequences when some of the elements repeat across sequences? Or is there another way to train a deep network that contains an LSTM when the sequences are of varying lengths?
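For concreteness, here is a minimal sketch of the failure mode described above. The question names no framework, so PyTorch is assumed, and all layer sizes and names are illustrative: a one-input LSTM is fed one scalar per step, and every step is trained toward the label of whatever sequence the item came from, so a value like 0.74 that appears under both labels receives conflicting targets.

```python
import torch
import torch.nn as nn

# A one-input LSTM fed one scalar per step; every step is trained
# toward the label of the sequence the item came from.
lstm = nn.LSTM(input_size=1, hidden_size=8)
head = nn.Linear(8, 2)  # two classes, e.g. the 1,0 and 0,1 targets
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(list(lstm.parameters()) + list(head.parameters()), lr=0.1)

# The same items appear under both labels: first as one-item sequences
# of class 0 (the 1,0 target), then together as one sequence of class 1
# (the 0,1 target) -- conflicting targets for identical inputs.
training_data = [([0.74], 0), ([0.83], 0), ([0.32], 0),
                 ([0.74, 0.83, 0.32], 1)]

for seq, label in training_data:
    state = None
    for x in seq:
        out, state = lstm(torch.tensor([[[x]]]), state)
        loss = loss_fn(head(out.squeeze(0)), torch.tensor([label]))
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Detach so the next step does not backprop through this one.
        state = tuple(s.detach() for s in state)
```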

Best Answer

Train it one character at a time. It shouldn't diverge unless the characters are the same but have different ideal outputs. In that case, consider using one-hot vectors instead of scalar inputs: if a, b, and c are your characters, then the input for a is 1, 0, 0.
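A minimal sketch of that suggestion, again assuming PyTorch (the alphabet, layer sizes, and the choice to classify from the final step are illustrative, not stated in the answer):

```python
import torch
import torch.nn as nn

# Each character becomes a distinct one-hot vector, so identical-looking
# scalar inputs can no longer carry two different labels.
alphabet = {'a': 0, 'b': 1, 'c': 2}

def one_hot(ch):
    v = torch.zeros(1, 1, len(alphabet))  # (seq_len=1, batch=1, features)
    v[0, 0, alphabet[ch]] = 1.0           # e.g. 'a' -> [1, 0, 0]
    return v

lstm = nn.LSTM(input_size=len(alphabet), hidden_size=8)
head = nn.Linear(8, 2)

def classify(seq):
    state = None
    for ch in seq:                         # feed one character at a time
        out, state = lstm(one_hot(ch), state)
    # One common choice (an assumption here, not part of the answer):
    # read the class from the final step's output only.
    return head(out.squeeze(0))

logits = classify("abc")                   # shape (1, 2): one score per class
```

Reading the prediction only at the final step is one way to get a single classification for a whole variable-length sequence, which matches what the question asks for.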
