[Math] Expected amount of time until Markov chain pattern appears

markov chains

Gary can be Cheerful $(1)$, So-so $(2)$, or Glum $(3)$. The following transition matrix describes the Markov chain that records Gary's mood changes.

$$P=\begin{bmatrix}0.5 & 0.4 & 0.1\\0.3 & 0.4 & 0.3\\0.2 & 0.3 & 0.5\end{bmatrix}$$
With stationary distribution vector
$$ \pi = (0.33871,0.370968,0.290323)$$
If Gary is cheerful today, what is the expected number of days until he has been glum three days in a row?
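For reference, here is a minimal numpy sketch (my own addition, not part of the exercise) that recovers the stationary vector from $P$; in exact fractions, $\pi = (21/62, 23/62, 18/62)$:

```python
import numpy as np

P = np.array([[0.5, 0.4, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])

# Stationary distribution: solve pi P = pi together with sum(pi) = 1,
# i.e. the overdetermined system [P^T - I; 1^T] pi = [0; 1].
A = np.vstack([P.T - np.eye(3), np.ones((1, 3))])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi = np.linalg.lstsq(A, b, rcond=None)[0]
print(pi)  # -> [0.33871  0.370968  0.290323], i.e. (21/62, 23/62, 18/62)
```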

This problem comes from Sheldon Ross's Introduction to Probability Models, 10th edition, Problem 51 of Chapter 4. In Chapter 4, Ross details how to obtain the expected number of transitions after $X_0=r$ until a pattern $(i_1, \dots, i_k)$ appears. The stated formula for a pattern with no overlaps is
$$ \mathbb{E}[N(i_1, \dots, i_k)\mid X_0=r]=\mu(r,i_1)-\mu(i_k,i_1)+\frac{1}{\pi_{i_1}T_{i_1,i_2}\cdots T_{i_{k-1},i_k}}, $$
where $\pi$ is the stationary distribution vector, $T$ is the transition probability matrix of the chain, $N(i_1, \dots, i_k)=\min\{n \ge k : X_{n-k+1}=i_1,\dots,X_n = i_k\}$, and $\mu(x,i_1)$ is the mean number of transitions for the chain to enter state $i_1$ given that it is in state $x$.
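To make the formula concrete, here is a small numpy sketch (my own illustration; the helper names `mean_hitting_time` and `expected_pattern_time` are made up for this example). The hitting times $\mu(x,i_1)$ are found by solving $\mu = \mathbf 1 + Q\mu$, where $Q$ is $T$ with the row and column of $i_1$ deleted. Since $(3,3,3)$ overlaps with itself, the check below uses the genuinely non-overlapping pattern $(2,3)$, starting cheerful:

```python
import numpy as np

P = np.array([[0.5, 0.4, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
pi = np.array([21, 23, 18]) / 62  # stationary distribution, exact fractions

def mean_hitting_time(P, target):
    """mu(x, target): mean transitions to first enter `target` from x != target,
    found by solving mu = 1 + Q mu, where Q drops the target row and column."""
    n = P.shape[0]
    others = [s for s in range(n) if s != target]
    Q = P[np.ix_(others, others)]
    mu = np.zeros(n)
    mu[others] = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    return mu

def expected_pattern_time(P, pi, pattern, r):
    """The book's non-overlap formula (0-indexed states; assumes r != pattern[0])."""
    i1, ik = pattern[0], pattern[-1]
    mu = mean_hitting_time(P, i1)
    p = pi[i1] * np.prod([P[a, b] for a, b in zip(pattern, pattern[1:])])
    return mu[r] - mu[ik] + 1.0 / p

# Example: the non-overlapping pattern (2, 3), starting from state 1 (cheerful).
print(expected_pattern_time(P, pi, [1, 2], 0))  # -> about 8.5507
```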

Ross also explains how to obtain an expression for a pattern that has overlaps, i.e. if for some $j<k$ we have $(i_{k-j+1},\dots,i_k)=(i_1,\dots,i_j)$. His strategy consists of breaking the pattern down into its overlapping parts and repeating until the first formula can be applied.

However, for Gary's mood swings this is not enough: the pattern $(3,3,3)$ consists of a single repeated state, so every proper suffix of it is also a prefix. I therefore cannot apply the first formula, and breaking the pattern down leads nowhere. The idea I had was to prepend an unknown non-glum state $x\in\{1,2\}$ and compute, via conditioning,
$$\mathbb{E}[N(x,3,3,3)|X_0=1] =\mathbb{E}[N(x,3,3,3)|X_0=1,x=1]\mathbb{P}(x=1)+\mathbb{E}[N(x,3,3,3)|X_0=1,x=2]\mathbb{P}(x=2)$$
Now, if the above is valid, then I have two non-overlapping patterns and I can apply the formula given in the book, but to calculate $\mathbb{P}(x=i)$ I would still have some trouble, although I think that taking $\mathbb{P}(x=i)=\pi_i$ is justified.

Main questions

  1. Is my approach correct? If not, how can I solve this problem?
  2. Is $\mathbb{P}(x=i)=\pi_i$ really justified? What does the vector $\pi$ really represent?

Best Answer

Something about your proposed solution doesn’t feel right, but I can’t quite put my finger on it yet. However, taking $\Pr(x=1)=\pi_1$, the first term comes out to be about 38.5, which is already greater than the correct value. The solution I present here is in the same spirit as the one offered by Did in that it solves the problem by expanding it a bit, except that this one will use a general result for absorbing Markov chains, which can be used for any such first-pattern-match problem.

Expand the state space to include all of the stages of completion of the sought-after pattern. In this case, there are two additional states: “has been glum two days in a row” and “has been glum three days in a row.” Since, strictly speaking, Markov processes don’t terminate, we model having matched the pattern by making the last state absorbing: from this state, the system can only transition to the same state. The other states are transient: the system can only return to each of them a finite number of times before entering an absorbing state. The problem then becomes one of determining the expected number of steps until absorption.

The transition matrix for this augmented Markov process is easily constructed from $P$. It is $$P'=\begin{bmatrix} 0.5&0.4&0.1&0&0 \\ 0.3&0.4&0.3&0&0 \\ 0.2&0.3&0&0.5&0 \\ 0.2&0.3&0&0&0.5 \\ 0&0&0&0&1 \end{bmatrix}=\left[\begin{array}{c|c}Q & R \\ \hline \mathbf 0^T&I\end{array}\right].$$ The $4\times4$ submatrix $Q$ involves only the transient states, with $[Q^k]_{ij}$ giving the $k$-step probability of the system’s being in state $j$ given that the system started in state $i$. The expected number of times that the system is in state $j$ before being absorbed, given that it started in state $i$, is captured by the fundamental matrix $$N=\sum_{k=0}^\infty Q^k=(I-Q)^{-1}.$$ The expected number of steps until being absorbed after starting in state $i$ is then given by the vector $\mathbf t=N\mathbf 1$, where $\mathbf 1$ is a vector of all $1$s, i.e., by the vector of row sums of $N$.

For our matrix $P'$, we compute $$N\approx\begin{bmatrix}10.3333 & 9.88889 & 4.0 & 2.0 \\ 8.66667 & 10.4444 & 4.0 & 2.0 \\ 7.0 & 7.66667 & 4.0 & 2.0 \\ 4.66667 & 5.11111 & 2.0 & 2.0\end{bmatrix}$$ and $$\mathbf t\approx\begin{bmatrix}26.2222 \\ 25.1111 \\ 20.6667 \\ 13.7778\end{bmatrix}.$$ Gary starts out cheerful, so we want $\mathbf t_1\approx26.2222$, which agrees with the value computed from Did’s answer.

In fact, this computation is Did’s answer in a different guise: we can rewrite that system of equations as $$\begin{bmatrix} 1-p_{CC} & -p_{CS} & -p_{CG} & 0 \\ -p_{SC} & 1-p_{SS} & -p_{SG} & 0 \\ -p_{GC} & -p_{GS} & 1 & -p_{GG} \\ -p_{GC} & -p_{GS} & 0 & 1 \end{bmatrix}\begin{bmatrix}t_C\\t_S\\t_G\\t_{GG}\end{bmatrix}=\begin{bmatrix}1\\1\\1\\1\end{bmatrix},$$ but this is $(I-Q)[t_C,t_S,t_G,t_{GG}]^T=\mathbf 1$.
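For completeness, here is a short numpy sketch of the computation above (variable names mirror the notation in the text):

```python
import numpy as np

# Transient-to-transient block Q of P' (states: C, S, G, GG).
Q = np.array([[0.5, 0.4, 0.1, 0.0],
              [0.3, 0.4, 0.3, 0.0],
              [0.2, 0.3, 0.0, 0.5],
              [0.2, 0.3, 0.0, 0.0]])

N = np.linalg.inv(np.eye(4) - Q)  # fundamental matrix (I - Q)^{-1}
t = N @ np.ones(4)                # expected steps until absorption
print(t)  # -> [26.2222 25.1111 20.6667 13.7778]; t[0] answers the question
```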

This method can be applied to any pattern: add states for each step of the pattern, including an absorbing state for the pattern having been matched, then compute the fundamental matrix and the appropriate row sum(s). If the pattern is self-overlapping, then some of these new states will have transitions to other new states instead of back to the original states as we had above. Constructing an appropriate transition graph for the match lies at the heart of the KMP string-search algorithm (in particular, the construction of the partial match table) and is very similar to constructing a DFA that accepts a language specified by a regular expression. Once you’ve computed the fundamental matrix $N$ for an absorbing Markov chain, there are other things besides expected absorption times that can be computed with it. See the Wikipedia article on absorbing Markov chains for a gloss.
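As a sketch of that general construction (my own code, not from any referenced source; the helper `expected_time_to_pattern` and its state encoding are assumptions made for illustration): pattern-progress states are KMP match lengths, a match of length $0$ is split by current mood since the mood determines the transition row, and a full match is absorbing. For Gary's pattern $(3,3,3)$ it reproduces $\mathbf t_1\approx 26.2222$:

```python
import numpy as np

def expected_time_to_pattern(P, pattern, start):
    """Expected number of steps until `pattern` (0-indexed states) first
    appears, starting from mood `start`, via the absorbing-chain method."""
    n, k = P.shape[0], len(pattern)

    # KMP failure function: fail[j] = length of the longest proper prefix
    # of pattern[:j] that is also a suffix of it.
    fail, f = [0] * (k + 1), 0
    for j in range(1, k):
        while f and pattern[j] != pattern[f]:
            f = fail[f]
        if pattern[j] == pattern[f]:
            f += 1
        fail[j + 1] = f

    def step(j, c):  # new match length after seeing mood c with match j < k
        while j and pattern[j] != c:
            j = fail[j]
        return j + 1 if pattern[j] == c else 0

    # Transient states (mood, match length); match length k is absorbing.
    # For match length j >= 1 the current mood is forced to be pattern[j-1].
    trans = [(s, 0) for s in range(n) if s != pattern[0]] \
          + [(pattern[j - 1], j) for j in range(1, k)]
    idx = {st: i for i, st in enumerate(trans)}
    Q = np.zeros((len(trans), len(trans)))
    for (mood, j), i in idx.items():
        for c in range(n):
            j2 = step(j, c)
            if j2 < k:  # j2 == k would mean absorption (pattern matched)
                Q[i, idx[(c, j2)]] += P[mood, c]

    N = np.linalg.inv(np.eye(len(trans)) - Q)  # fundamental matrix
    t = N @ np.ones(len(trans))                # expected steps to absorption
    # The starting mood counts as initial progress, matching P' above.
    return t[idx[(start, step(0, start))]]

P = np.array([[0.5, 0.4, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
print(expected_time_to_pattern(P, [2, 2, 2], 0))  # -> about 26.2222
```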
