Solved – the difference between a Hidden Markov Model and a Mixture Markov Model

hidden-markov-model, markov-process

From what I understand, Hidden Markov Models are those that relate observable and unobservable states, whilst Mixture Markov Models are techniques to cluster sequences according to which Markov model, out of a set of candidates, best approximates each sequence.

However, I am having trouble understanding the boundary between the two terms. Can I define a Hidden Markov Model without following the mixture approach, and conversely, can I have a Mixture Model without considering hidden states?

Thanks for your help!

Best Answer

I am not familiar with what you call mixture Markov models. However, as I explain later in this answer, some people call hidden Markov models dynamical mixture models, so it is possible that others refer to them as mixture Markov models. I would be happy if you could indicate where you read this term.

The previous answer to this question states things that are somewhat inaccurate about the relationship between mixture models and HMMs. What follows aims to clarify this.

Mixture models are simply a weighted sum of probability distributions, nothing more:

$P(X|\theta) = \sum_{i=1}^M w_i\, p_i(X|\theta_i)$, with $w_i \ge 0$ and $\sum_{i=1}^M w_i = 1$

$M$ is the number of components in the mixture. A random variable can follow a mixture model the same way it follows a probability distribution.
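To make the definition concrete, here is a minimal sketch of a two-component Gaussian mixture in Python. The weights and component parameters are purely illustrative assumptions, not taken from the answer:

```python
import math
import random

def gaussian_pdf(x, mu, sigma):
    """Density of a univariate Gaussian N(mu, sigma^2)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical mixture with M = 2 components: weights w_i sum to 1.
weights = [0.3, 0.7]
params = [(-2.0, 1.0), (3.0, 0.5)]  # (mu_i, sigma_i) for each component

def mixture_pdf(x):
    """P(x | theta) = sum_i w_i * p_i(x | theta_i)."""
    return sum(w * gaussian_pdf(x, mu, s) for w, (mu, s) in zip(weights, params))

def sample_mixture():
    """Draw a component index i with probability w_i, then sample from it --
    this is exactly how 'a random variable follows a mixture model'."""
    i = random.choices(range(len(weights)), weights=weights)[0]
    mu, s = params[i]
    return random.gauss(mu, s)
```

Note that there is nothing sequential or Markovian here: each draw is independent.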

Hidden Markov models (HMMs) are far more sophisticated models, and mixture models can be a component of them.

An HMM is based on a Markov chain of states (called hidden states). It does not model a single random variable but a time series (an ordered sequence of values, multidimensional or not). Each state of the Markov chain is associated with a probability distribution, which can itself be a mixture of distributions (this is the only point where the two concepts connect!).

At time $t=0$, a state is drawn from an initial probability mass function. The observation at time $t=0$ is assumed to follow (or to have been generated by) the probability distribution (or mixture, depending on the case) associated with this drawn state. At time $t=1$, the system enters a new hidden state drawn from a transition matrix, and the same procedure is repeated.
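The generative procedure just described can be sketched as follows. This is a toy two-state HMM with Gaussian emissions; all parameter values are illustrative assumptions:

```python
import random

# Hypothetical 2-state HMM with one Gaussian emission per state.
initial = [0.6, 0.4]                  # initial probability mass function over states
transition = [[0.9, 0.1],             # transition[i][j] = P(next state = j | current state = i)
              [0.2, 0.8]]
emission = [(0.0, 1.0), (5.0, 1.0)]   # (mu, sigma) of each state's Gaussian

def generate(T, seed=None):
    """Generate T observations: draw the initial state, emit an observation
    from that state's distribution, draw the next state from the transition
    matrix, and repeat."""
    rng = random.Random(seed)
    state = rng.choices([0, 1], weights=initial)[0]
    obs = []
    for _ in range(T):
        mu, sigma = emission[state]
        obs.append(rng.gauss(mu, sigma))
        state = rng.choices([0, 1], weights=transition[state])[0]
    return obs
```

Replacing each state's single Gaussian with a mixture of Gaussians gives the mixture-emission HMM mentioned above.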

Hidden Markov models are sometimes called dynamical mixture models (as in this technical report) because, when states are associated with mixtures of probability distributions, they can be seen as a mixture model that changes dynamically over time. However, these changes are modeled via a transition matrix and are fully part of what an HMM is.

Contrary to what is said in the other answer, an HMM need not use mixtures at all and can have any number of states (even an infinite number). An HMM with a single state is basically a mixture model, and if that mixture has only one component, it reduces to a simple probability distribution.
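The single-state degenerate case is easy to see in code. With one state, the transition matrix is trivially $[[1.0]]$, the hidden state never changes, and the sequence collapses to i.i.d. draws from the state's emission, i.e. a plain mixture model (parameters below are illustrative assumptions):

```python
import random

# Single-state "HMM": transition matrix is [[1.0]], so the hidden state is
# constant and the Markov structure is vacuous.  The state's emission is a
# hypothetical 2-component Gaussian mixture.
weights = [0.5, 0.5]
components = [(-1.0, 0.5), (1.0, 0.5)]  # (mu_i, sigma_i)

def emit(rng):
    """One draw from the single state's mixture emission."""
    i = rng.choices([0, 1], weights=weights)[0]
    mu, sigma = components[i]
    return rng.gauss(mu, sigma)

def generate(T, seed=None):
    """With one state there is nothing Markov left: the output is simply
    T independent samples from the mixture."""
    rng = random.Random(seed)
    return [emit(rng) for _ in range(T)]
```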