Recurrent Neural Network – How to Reframe a Hidden Markov Model (HMM) Problem as a Recurrent Neural Network (RNN)

hidden-markov-model, machine-learning, neural-networks, recurrent-neural-network

Inspired by this question, I have been considering how one would reframe an HMM problem as an RNN problem.

For HMMs we have an observable time series $y(t)$ which corresponds to a sequence of hidden states $q(t)$. The transition probabilities between hidden states are described by a stochastic matrix $\bar{A}$. For a Markov process, it is possible to derive an expression for the probability of a hidden path $Q = (q(1), \dots, q(T))$ given the observation sequence $O = (y(1), \dots, y(T))$, i.e. $P(Q \mid O)$. One can then use an optimisation algorithm, e.g. Viterbi, to find the most probable path $Q^* = \arg\max_Q P(Q \mid O)$.
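As a concrete illustration, here is a minimal NumPy sketch of the Viterbi recursion, assuming a discrete observation alphabet. The names `A`, `B`, `pi`, and `obs` (emission matrix, initial distribution, observed symbols) are illustrative choices, not notation fixed by the question.

```python
import numpy as np

def viterbi(A, B, pi, obs):
    """Return the most probable hidden-state path for the observed symbols."""
    n_states, T = A.shape[0], len(obs)
    delta = np.zeros((T, n_states))           # best path probability ending in each state
    psi = np.zeros((T, n_states), dtype=int)  # backpointers to the best predecessor

    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        trans = delta[t - 1, :, None] * A     # trans[i, j] = delta[t-1, i] * A[i, j]
        psi[t] = trans.argmax(axis=0)         # best predecessor for each state j
        delta[t] = trans.max(axis=0) * B[:, obs[t]]

    # Backtrack from the most probable final state.
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path

# Toy example: 2 hidden states, 3 observation symbols.
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
print(viterbi(A, B, pi, obs=[0, 1, 2, 2]))
```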

But how does this formulation translate to an RNN?

With RNNs we talk in the language of training data, nodes, layers, etc., and I can't see how to map this onto the HMM language…

Best Answer

The hidden nodes (states) in an HMM are random variables, whereas in an RNN only the input nodes could be considered random variables; all the other nodes are just deterministic nonlinear functions of their inputs.
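To make the distinction concrete, here is a minimal sketch contrasting one time step of each model (the parameter names `W_h`, `W_x`, and `b` are made up for illustration): the HMM step *samples* the next hidden state, while the RNN step *computes* it deterministically.

```python
import numpy as np

rng = np.random.default_rng(0)

def hmm_step(q_prev, A):
    """HMM: the next state is a random variable drawn from row q_prev of A."""
    return rng.choice(A.shape[0], p=A[q_prev])

def rnn_step(h_prev, x, W_h, W_x, b):
    """RNN: the next hidden state is a fixed nonlinear function of h_prev and x."""
    return np.tanh(W_h @ h_prev + W_x @ x + b)
```

Calling `hmm_step` twice from the same state can return different states; `rnn_step` with the same arguments always returns the same hidden vector.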

Thus, it is difficult to formulate an HMM with an RNN. However, some attempts have been made to combine the ideas of dynamic Bayesian networks (DBNs), of which HMMs are an example, and neural networks, e.g. the VRNN, alpha-nets, or GenHMM. How much those still resemble vanilla RNNs or DBNs is up for discussion.
