Solved – LSTM cell output activation for series

lstm, machine learning, neural networks, tensorflow

This question asks whether there is a more efficient way to make an LSTM act as a regression model, rather than merely assigning probabilities to the next character.

In general, a seq2seq LSTM character generator works roughly like this:

C H A P T -> E
H A P T E -> R

That's great. But what if I want to model a numerical series whose next value is not just a probability over tokens, but is instead governed by a regression function such as f(x) = 2x² + 3, or perhaps by several such f(x)s describing the series, e.g.:

1 3 5 7 -> 9
3 5 7 9 -> 11

My question: should I replace the activation function of each LSTM cell (normally tanh) with another activation function that both preserves the probability of the next character and somehow captures, or "understands", that the series follows a regression function?

Best Answer

To perform regression with a recurrent neural network (RNN), people typically add a dense layer after the RNN layer that takes the RNN hidden state as input. As a result, a traditional LSTM with tanh followed by a dense layer works fine: even though the hidden state is squashed to (-1, 1) by the tanh, the dense layer's linear output is unbounded, so the model is not limited to predicting values between -1 and 1.

Examples:
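
For instance, here is a minimal sketch of this setup in TensorFlow/Keras, applied to the odd-number series from the question. This is not from the original answer; the layer size, training settings, and scaling choice are illustrative assumptions.

    import numpy as np
    import tensorflow as tf

    # Toy data: sliding windows over the odd numbers 1, 3, 5, 7, ...
    series = np.arange(1, 200, 2, dtype=np.float32)
    window = 4
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]

    # Scale inputs so the tanh-gated LSTM operates in a comfortable range
    scale = series.max()
    X, y = X / scale, y / scale
    X = X[..., np.newaxis]            # shape (samples, timesteps, features)

    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(32, input_shape=(window, 1)),  # standard tanh cell, unchanged
        tf.keras.layers.Dense(1),     # linear regression head: output is unbounded
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X, y, epochs=200, verbose=0)

    # 1 3 5 7 -> should predict something close to 9
    query = np.array([[1, 3, 5, 7]], dtype=np.float32) / scale
    print(model.predict(query[..., np.newaxis]) * scale)

The key design choice is the linear Dense head: the LSTM's tanh-squashed hidden state serves only as an intermediate representation, and the dense layer rescales it to any real value, so there is no need to change the cell's activation function.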