Solved – Hidden Markov Model vs Markov Transition Model vs State-Space Model…

hidden markov model, machine learning, self-study

For my master's thesis, I am developing a statistical model for the transitions between states defined by serological status. I won't go into the details of the application for now, as my question is more general/theoretical. My intuition is that I should be using a Hidden Markov Model (HMM). The trouble is that, as I work through the literature and other background research needed to formulate my model, I keep running into confusion over terminology and over the exact differences between types of hidden process models. I am only vaguely aware of what distinguishes them (examples to come). Further, at least from what I have seen in the literature, the vocabulary around this type of modeling is very non-standard, and I occasionally see terms used interchangeably in one context but contrasted in another.

So, I was hoping people could help me disambiguate some of these terms. I have a number of questions, but I am guessing that once one or two are answered satisfactorily, the rest will become disentangled as a result. I hope this isn't too long-winded; if a moderator wants me to split this up into multiple posts, I will. In any case, each numbered question below is followed by the details I've uncovered during my literature search.

So, in no particular order:

1) What exactly is a "hidden process model"?

I have been operating under the impression that "hidden process model" is an umbrella term for a number of different types of statistical models, all essentially probabilistic descriptions of time series data generated by "a system of overlapping, potentially hidden, linearly additive processes" ([1]). Indeed, [2] defines a "hidden process model" as "a general term referring to either a state-space model or a hidden Markov model." [1] seems to imply that a hidden Markov model is a subtype of hidden process model specifically geared towards inference on binary states; the basic implication, as I read it, is that a hidden process model is a generalization of a hidden Markov model. I sometimes see both "hidden process model" and the phrase "hidden process dynamic model", but it is not clear to me whether these are distinct concepts.

Is this intuition on my part correct? If not, does anybody have a reference that more clearly delineates these methods?

2) What is the difference between a Hidden Markov Model and a state-space model?

Again returning to [2] (if only because the paper comes with a clear glossary of terms, not because the paper itself seems to be particularly authoritative; it is just a convenient source of one-sentence definitions), the difference seems to be that a Hidden Markov Model is a specific type of state-space model in which the states are Markovian (there doesn't seem to be a definite restriction on the order of the Markov process; i.e. first order,…,kth order). Here, a state-space model is defined as "A model that runs two time series in parallel, one captures the dynamic of the true states (latent) and the other consists of observations that are made from these underlying but possibly unknown states." If those states also exhibit the Markov property, then it is a Hidden Markov Model.
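To make the "two time series in parallel" description concrete, here is one common way to write such a model. The notation is an assumption for illustration (the glossary in [2] is not quoted with any formula here): $x_t$ denotes the latent state, $y_t$ the observation, and the latent chain is taken to be first-order Markov.

```latex
p(x_{1:T}, y_{1:T})
  = p(x_1) \prod_{t=2}^{T} p(x_t \mid x_{t-1}) \prod_{t=1}^{T} p(y_t \mid x_t)
```

Nothing in this factorization fixes whether $x_t$ is discrete or continuous. A kth-order chain can also be put in this form by taking the stacked vector $(x_t, \dots, x_{t-k+1})$ as the state, which may be why no restriction on the order appears in the definition.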

However, [3] defines the difference between state-space models and Hidden Markov Models as being related to the characteristics of the latent state. Here, a Hidden Markov Model deals with discrete states while state-space models deal with continuous states; otherwise, they are conceptually identical.

These seem to me to be two very different definitions. Under one, a Hidden Markov Model is a subtype of state-space model, while under the other they are both just different instantiations of a broader class of hidden process models. Which of these is correct? My intuition points me to follow [3] as opposed to [2], but I can't find an authoritative source that supports this.
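As a rough illustration of how similar the two definitions are structurally, the sketch below simulates one model with a discrete latent state and one with a continuous latent state. Everything in it is hypothetical: the function names, the two-state example (think seronegative/seropositive with an imperfect test), and all parameter values are made up for illustration and come from none of the references.

```python
import random

def simulate_hmm(T, trans, emit, init, rng):
    """Discrete latent state: sample a hidden state path and discrete observations."""
    states, obs = [], []
    x = rng.choices(range(len(init)), weights=init)[0]
    for _ in range(T):
        states.append(x)
        # observation depends only on the current hidden state
        obs.append(rng.choices(range(len(emit[x])), weights=emit[x])[0])
        # next hidden state depends only on the current hidden state
        x = rng.choices(range(len(trans[x])), weights=trans[x])[0]
    return states, obs

def simulate_linear_gaussian_ssm(T, a, q_sd, r_sd, x0, rng):
    """Continuous latent state: x_t = a * x_{t-1} + noise, y_t = x_t + noise."""
    states, obs = [], []
    x = x0
    for _ in range(T):
        states.append(x)
        obs.append(x + rng.gauss(0.0, r_sd))
        x = a * x + rng.gauss(0.0, q_sd)
    return states, obs

rng = random.Random(0)

# Hypothetical two-state example: 0 = seronegative, 1 = seropositive.
trans = [[0.9, 0.1], [0.2, 0.8]]     # rows: current state, columns: next state
emit = [[0.95, 0.05], [0.10, 0.90]]  # imperfect test result given true state
init = [0.7, 0.3]

hmm_states, hmm_obs = simulate_hmm(50, trans, emit, init, rng)
ssm_states, ssm_obs = simulate_linear_gaussian_ssm(50, 0.9, 0.5, 1.0, 0.0, rng)
```

The two simulators share the same loop: emit from the current latent state, then advance the latent state. Only the type of the latent variable differs, which is the distinction drawn in [3].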

3) What is a "Markov transition model"?

Another term that has come up in a lot of sources is "Markov transition model". I have not been able to find this phrase in any textbooks, but it appears a lot in journal articles (simply plug it into Google to confirm). I haven't been able to find a rigorous definition of the term (every paper I find cites another paper, which cites another, etc., sending me down a PubMed rabbit hole that leads nowhere sane). My impression from context is that it is a very general term to refer to any model in which the object of inference is the transitions between states that follow a Markov process, and that a Hidden Markov Model may be considered a specific type of Markov transition model. [4], however, seems to use transition model, Hidden Markov Model, and several similar terms interchangeably.

On the other hand, [5] talks about Markov transition models and Hidden Markov Models a bit differently. The authors state: "Transition models provide a method for summarising respondent dynamics that are helpful for interpreting results from more complex hidden Markov models". I don't entirely understand what they mean by this phrase, and can't find a justification for it elsewhere in the paper. However, they seem to imply that Markov transition models use time as a continuous variable, while hidden Markov models use time as a discrete variable (they don't directly say this; they say they use the R package 'msm' to fit Markov transition models, and later describe 'msm' as treating time continuously in contrast to the R package for HMMs).
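The continuous-versus-discrete time contrast can be sketched as follows. This assumes the continuous-time model is parameterised by a transition intensity matrix Q, so that the transition probabilities over an interval of length t are P(t) = exp(Qt). That parameterisation, the helper names, and the rates are assumptions for illustration; the post does not describe how [5] specifies its models.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(M, terms=60):
    """Matrix exponential by truncated Taylor series (adequate for small, well-scaled M)."""
    n = len(M)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # holds M^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, M)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def transition_probs(Q, t):
    """P(t) = exp(Q * t) for an intensity matrix Q and elapsed time t."""
    return mat_exp([[q * t for q in row] for row in Q])

# Hypothetical two-state chain: rate a from state 0 to 1, rate b from 1 to 0.
a, b = 0.3, 0.1
Q = [[-a, a], [b, -b]]

P_half = transition_probs(Q, 0.5)  # observations half a time unit apart
P_two = transition_probs(Q, 2.0)   # observations two time units apart
```

A discrete-time model would instead estimate a single one-step transition matrix and assume equally spaced observations. With an intensity matrix, irregular gaps between observations are handled by evaluating P(t) at each gap.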

4) Where do other concepts, for example Dynamic Bayesian Networks, fit in?

According to Wikipedia, a Dynamic Bayesian Network is a "generalization of hidden Markov models and Kalman filters". Elsewhere, I have seen hidden Markov models defined as a special case of a Dynamic Bayesian Network in which "the entire state of the world is represented by a single hidden state variable" (see the question "Definition of dynamic Bayesian system, and its relation to HMM?"). I generally understand this relationship, and it is well explained by [6].

However, I am having a hard time understanding how this relationship fits in the broader picture of things. That is, given this relationship between HMMs and DBNs, how are state-space models and hidden process models related to the two? How do all of these different types of methods interrelate, given that there seem to be multiple "generalizations" of hidden Markov models?
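One way to see the "single hidden state variable" relationship: if a DBN has two hidden variables that evolve independently of each other (an assumption made here for illustration, as in a factorial HMM), they can be collapsed into one composite hidden state whose transition matrix is the Kronecker product of the individual ones. The matrices below are hypothetical.

```python
def kron(A, B):
    """Kronecker product of two square matrices stored as lists of rows.

    Composite state (i, k) maps to row index i * len(B) + k.
    """
    return [
        [A[i][j] * B[k][l] for j in range(len(A)) for l in range(len(B))]
        for i in range(len(A)) for k in range(len(B))
    ]

# Two hypothetical binary hidden variables, each with its own transition matrix.
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.7, 0.3], [0.4, 0.6]]

# Transition matrix of the equivalent single-variable HMM with 2 * 2 = 4 states.
joint = kron(A, B)
```

The flattened model is an ordinary HMM, but its state space grows multiplicatively with each added hidden variable. The DBN representation keeps the factored structure explicit instead.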


References:

[1] Tom M. Mitchell, Rebecca Hutchinson, Indrayana Rustandi. "Hidden Process Models". 2006. CMU-CALD-05-116. Carnegie Mellon University.

[2] Olivier Gimenez, Jean-Dominique Lebreton, Jean-Michel Gaillard, Remi Choquet, Roger Pradel. "Estimating demographic parameters using hidden process dynamic models". Theoretical Population Biology. 2012. 82(4):307-316.

[3] Barbara Engelhardt. "Hidden Markov Models and State Space Models". STA561: Probabilistic machine learning. Duke University. http://www.genome.duke.edu/labs/engelhardt/courses/scribe/lec_09_25_2013.pdf

[4] Jeroen K. Vermunt. "Multilevel Latent Markov Modeling in Continuous Time with an Application to the Analysis of Ambulatory Mood Assessment Data". Social Statistics Workshop. 2012. Tilburg University. http://www.lse.ac.uk/statistics/events/SpecialEventsandConferences/LSE2013-Vermunt.pdf

[5] Ken Richardson, David Harte, Kristie Carter. "Understanding health and labour force transitions: Applying Markov models to SoFIE longitudinal data". Official Statistics Research Series. 2012.

[6] Zoubin Ghahramani. "An Introduction to Hidden Markov Models and Bayesian Networks". International Journal of Pattern Recognition and Artificial Intelligence. 2001. 15(1): 9-42.

Best Answer

The following is quoted from the Scholarpedia website:

State space model (SSM) refers to a class of probabilistic graphical model (Koller and Friedman, 2009) that describes the probabilistic dependence between the latent state variable and the observed measurement. The state or the measurement can be either continuous or discrete. The term “state space” originated in 1960s in the area of control engineering (Kalman, 1960). SSM provides a general framework for analyzing deterministic and stochastic dynamical systems that are measured or observed through a stochastic process. The SSM framework has been successfully applied in engineering, statistics, computer science and economics to solve a broad range of dynamical systems problems. Other terms used to describe SSMs are hidden Markov models (HMMs) (Rabiner, 1989) and latent process models. The most well studied SSM is the Kalman filter, which defines an optimal algorithm for inferring linear Gaussian systems.
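Outside the quoted passage, the linear Gaussian case it mentions can be sketched with a scalar Kalman filter. The model form (x_t = a * x_{t-1} + w_t, y_t = x_t + v_t), the function name, and all numbers below are assumptions chosen for illustration, not taken from the Scholarpedia article.

```python
def kalman_filter_1d(ys, a, q, r, m0, p0):
    """Scalar Kalman filter.

    Model: x_t = a * x_{t-1} + w_t,  y_t = x_t + v_t,
    with Var(w_t) = q, Var(v_t) = r, and prior x_0 ~ N(m0, p0).
    Returns the filtered means and variances of x_t given y_1..y_t.
    """
    m, p = m0, p0
    means, variances = [], []
    for y in ys:
        # Predict: push the previous posterior through the state equation.
        m_pred = a * m
        p_pred = a * a * p + q
        # Update: weight the new observation by the Kalman gain.
        k = p_pred / (p_pred + r)
        m = m_pred + k * (y - m_pred)
        p = (1.0 - k) * p_pred
        means.append(m)
        variances.append(p)
    return means, variances

# Hypothetical noisy measurements of a slowly drifting quantity.
ys = [1.0, 1.2, 0.9, 1.1]
means, variances = kalman_filter_1d(ys, a=1.0, q=0.01, r=0.25, m0=0.0, p0=1.0)
```

The discrete-state analogue of this recursion is the HMM forward algorithm, which replaces the Gaussian mean and variance with a vector of state probabilities.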
