This question is about whether there's a more efficient way to make an LSTM act as a regression model rather than just assigning probabilities to the next character.
In general, a seq2seq LSTM character generator behaves like this:
C H A P T -> E
H A P T E -> R
That's great. But what if I want to model a numerical series where the output isn't just a probability over the next character, but instead follows a regression function such as f(x) = 2x² + 3? Or a series described by several such f(x)s, e.g.:
1 3 5 7 -> 9
3 5 7 9 -> 11
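For concreteness, here's a sketch of how I'd frame those windows as training pairs (assuming NumPy; the window size of 4 simply mirrors the examples above):

import numpy as np

# odd-number series: 1, 3, 5, ..., 19
series = np.arange(1, 21, 2, dtype=np.float32)
window = 4

# inputs are 4 consecutive values, the target is the value that follows
X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]

print(X[0], "->", y[0])  # [1. 3. 5. 7.] -> 9.0
print(X[1], "->", y[1])  # [3. 5. 7. 9.] -> 11.0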
My question: should I find a way to replace the activation function of each LSTM cell (normally tanh) with another activation function that can, at the same time, preserve the probability of the next character and somehow preserve or "understand" that there is a regression over the numbers?
Best Answer
To perform regression with a recurrent neural network (RNN), people typically add a dense layer after the RNN layer, taking the RNN's hidden state as input. As a result, the traditional LSTM with tanh followed by a dense layer works fine: because the dense output layer is linear, the model is not restricted to predicting values between -1 and 1.
Examples:
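As a minimal sketch, assuming TensorFlow/Keras (the layer size, epoch count, and scaling below are illustrative choices, not prescribed): keep the tanh LSTM as-is and put a linear Dense head on top, trained with mean squared error:

import numpy as np
import tensorflow as tf

# toy data: 1, 3, 5, ... with sliding windows of 4 -> next value
series = np.arange(1, 41, 2, dtype=np.float32)
window = 4
X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]

scale = series.max()               # scale to roughly [0, 1]; helps the tanh cells
X = X[..., np.newaxis] / scale     # shape: (samples, timesteps, features)
y = y / scale

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(window, 1)),  # tanh cells, unchanged
    tf.keras.layers.Dense(1),                           # linear head: unbounded output
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=500, verbose=0)

probe = np.array([[1., 3., 5., 7.]])[..., np.newaxis] / scale
print(model.predict(probe, verbose=0) * scale)  # should be close to 9

The key design point is that the Dense layer has no activation (i.e. a linear one), so the regression output is unbounded; the tanh inside the cells only shapes the hidden state, not the final prediction.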