RNNs – Understanding Transition Functions

hidden-markov-model, machine-learning, markov-process, recurrent-neural-network, transition-matrix

From what I understand, the hidden state of an RNN is analogous to the probability distribution over hidden states in, for example, a Hidden Markov Model, except that it evolves deterministically.

Thus, just as probabilistic models such as Markov chains or HMMs have state transition probabilities, which form the transition matrix, is there a similar state transition function in recurrent neural networks?

Best Answer

The hidden state of an RNN is updated by

$h_t = Wh_{t-1} + Ux_t$

where $W, U$ are parameters of the RNN, and $x_t$ is the $t$-th input. (In practice a nonlinearity such as $\tanh$ is usually applied to the right-hand side, but that doesn't change the point here.)
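As a concrete sketch of this update in NumPy (the dimensions, random weights, and the `step` helper are illustrative assumptions, not part of the answer):

```python
import numpy as np

# A minimal sketch of the update h_t = W h_{t-1} + U x_t.
# Hidden size 4 and input size 3 are arbitrary choices for illustration.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))   # hidden-to-hidden weights
U = rng.normal(size=(4, 3))   # input-to-hidden weights

def step(h_prev, x_t):
    """One transition: the deterministic 'transition function' of the RNN."""
    return W @ h_prev + U @ x_t

h = np.zeros(4)               # initial hidden state
x = rng.normal(size=3)        # one input
h = step(h, x)                # h_1 = W h_0 + U x_1
```

So the analogue of the transition matrix is this transition *function*: it maps the previous hidden state (and the current input) to the next hidden state, rather than mapping a distribution to a distribution.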

There is a difference, though, between the hidden state in a Markov chain and the hidden state in an RNN: a Markov chain is a stochastic process, whereas an RNN is a series of computational steps that can be carried out either deterministically or stochastically.

For example: if the $x_t$ are fixed inputs and the output is $y_t = V h_t$, then the output of the RNN is completely determined; there is nothing random going on here.
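A self-contained sketch of this deterministic case (weights, sizes, and the fixed seed are again illustrative assumptions):

```python
import numpy as np

# Deterministic rollout: fixed weights, fixed inputs => fully determined outputs.
rng = np.random.default_rng(0)
W, U, V = rng.normal(size=(4, 4)), rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
xs = rng.normal(size=(10, 3))   # a fixed input sequence of length 10

h = np.zeros(4)
ys = []
for x_t in xs:
    h = W @ h + U @ x_t         # h_t = W h_{t-1} + U x_t
    ys.append(V @ h)            # y_t = V h_t
# Re-running this script reproduces exactly the same ys: once the weights
# and inputs are fixed, nothing random happens at "run time".
```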

On the other hand, if we take this output $y_t$, sample from it, and feed the sample back into the RNN as the next input, $x_{t+1} \sim \mathcal{N}(y_t, \Sigma)$, then we have constructed a Markov process.
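A sketch of that stochastic feedback loop (the noise covariance $\Sigma$ and all sizes are assumptions for illustration; note the output must have the same dimension as the input so it can be fed back):

```python
import numpy as np

# Closing the loop stochastically: sample x_{t+1} ~ N(y_t, Sigma) and feed it
# back in. The pair (h_t, x_t) now evolves as a Markov process: its next value
# depends only on its current value, plus fresh Gaussian noise.
rng = np.random.default_rng(0)
W, U, V = rng.normal(size=(4, 4)), rng.normal(size=(4, 3)), rng.normal(size=(3, 4))
Sigma = 0.1 * np.eye(3)         # assumed noise covariance

h = np.zeros(4)
x = rng.normal(size=3)          # arbitrary initial input
for t in range(10):
    h = W @ h + U @ x           # deterministic part of the transition
    y = V @ h                   # y_t = V h_t
    x = rng.multivariate_normal(y, Sigma)  # x_{t+1} ~ N(y_t, Sigma): the stochastic step
```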