Solved – Converting 2nd order Markov chain to the 1st order equivalent

markov chainself-study

Given a 2nd order Markov chain where each state takes values in the set $\mathcal{X}=\{A,C,G,T\}$, such that all transition probabilities $p(x_t|x_{t-1},x_{t-2})$ are larger than zero,

How to convert it to the equivalent 1st order one with all the transition probabilities defined?

Best Answer

Here's a way to do it:

(I may be writing my state vectors and transition matrices transposed relative to the way you might have learned them, or even the way they're usually done. If that's the case you'll need to translate back.)

The probability model gives you probabilities for 4 output states at time $t$ in terms of the 16 input states - the possible ordered pairs for $(x_{t-1},x_{t-2})$.

For speed of writing, let's write $AC$ for $(A,C)$ and so on.

\begin{array}{c|cccc|cccc|c} & AA & AC &AT&AG& CA &CC &CT &CG& \ldots \\ \hline A &p_{AA\to A}&p_{AC\to A}&p_{AT\to A}&p_{AG\to A}&p_{CA\to A}&p_{CC\to A}&p_{CT\to A}&p_{CG\to A}\\ C &p_{AA\to C}&p_{AC\to C}&p_{AT\to C}&p_{AG\to C}&p_{CA\to C}&p_{CC\to C}&p_{CT\to C}&p_{CG\to C}\\ T &p_{AA\to T}&p_{AC\to T}&p_{AT\to T}&p_{AG\to T}&p_{CA\to T}&p_{CC\to T}&p_{CT\to T}&p_{CG\to T}\\ G &p_{AA\to G}&p_{AC\to G}&p_{AT\to G}&p_{AG\to G}&p_{CA\to G}&p_{CC\to G}&p_{CT\to G}&p_{CG\to G}\\ \hline \end{array}

We could label the partitions with boldface versions of the state $x_{t-1}$:

\begin{array}{c|cccc|cc|c|cc} & AA & AC &AT&AG& CA & &\ldots && \ldots& GG \\ \hline A & & & & & & & & & & \\ C & & \mathbf{A} & & & \quad\mathbf{C} & &\mathbf{T}& & \mathbf{G} & \\ T & & & & & & & & & & \\ G & & & & & & & & & & \\ \hline \end{array}

As I mentioned in comments, you need to extend the state. Let $z_t$ consist of pairs of states $(x_t,x_{t-1})$ and now consider a Markov Chain in $z_t$; that is, you have a transition matrix $p(z_t|z_{t-1})$.

So the state at time $t$ will be one of the 16 pairs $(A,A), (A,C) \ldots (G,G)$, and the transition matrix will be a 16 $\times$ 16 matrix of transition probabilities that will be mostly zero (necessarily so, because any pair that doesn't have the second component of $z_t$ match with the first component of $z_{t-1}$ is impossible).

As above, for speed of writing, let's also write $AC$ for $(A,C)$ and so on.

For ease of display I am going to define $z_{t-1}^*$ which is simply a permuted $z_{t-1}$. We can write $p(z_t|z_{t-1}^*)$ and then arrive back at $p(z_t|z_{t-1})$ by simple permutation.

So the transition matrix for $p(z_t|z_{t-1}^*)$ is of the form:

\begin{array}{c|cccc|cc|c|cc} & AA & AC &AT&AG& CA & &\ldots && \ldots& GG \\ \hline AA & & & & & & & & & & \\ CA & & \mathbf{A} & & & \quad\mathbf{0} & &\mathbf{0}& & \mathbf{0} & \\ TA & & & & & & & & & & \\ GA & & & & & & & & & & \\ \hline AC & & & & & & & & & & \\ \vdots & & \mathbf{0} & & & \quad\mathbf{C} & &\mathbf{0}& &\mathbf{0} & \\ \vdots & & & & & & & & & & \\\hline \vdots & & \mathbf{0} & & &\quad\mathbf{0} & &\mathbf{T}& & \mathbf{0} &\\ \vdots & & & & & & & & & & \\ \hline \vdots & & \mathbf{0} & & &\quad\mathbf{0} & &\mathbf{0}& & \mathbf{G} & \\ GG & & & & & & & & & & \\ \hline \end{array}

We can then rearrange either the rows or columns so they're in the same order; the transition matrix no longer has that simple structure, but contains the same values.

Generally, you can use this procedure to transform any $k$-th order Markov chain to a first-order MC (also holds for Hidden Markov Models).