Before you start thinking about the standard errors of your estimates, it would be good to know, or to check, if your Markov chain has a stationary distribution.
There is no way to say anything from the small sample you provided, as it seems to be randomly shuffled pairs of states rather than the actual history of the process (the pair on the next line doesn't start in the state where the previous line ended). Going forward, I will assume that the chain has a stationary distribution.
Regarding the bootstrap for the state pairs: it seems you are trying to obtain a confidence interval for your maximum likelihood estimates of the transition probabilities. The bootstrap can be used for this, but I would consider bootstrapping one row of the transition matrix at a time - essentially simulating a multinomial random variable (see also this paper).
A row-wise approach also guarantees that you will not obtain invalid matrices from your bootstrap.
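The row-wise bootstrap can be sketched in a few lines of base R. The counts below are hypothetical placeholders for the observed transitions out of one departure state; replace them with your own data:

```r
set.seed(1)
row_counts <- c(A = 0, B = 1, C = 2, D = 1, E = 0)  # hypothetical counts out of state A
n <- sum(row_counts)
p_hat <- row_counts / n  # MLE of this row of the transition matrix

# Resample the row as a multinomial B times and collect bootstrap probabilities
B <- 2000
boot_p <- t(rmultinom(B, size = n, prob = p_hat)) / n

# Percentile 95% intervals for each transition probability in the row
ci <- apply(boot_p, 2, quantile, probs = c(0.025, 0.975))
```

Because every bootstrap replicate is a multinomial draw, each resampled row sums to 1 by construction, which is what guarantees you never produce an invalid transition matrix.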
I think you could also calculate the 95% confidence intervals for your estimates directly (for example using the MultinomialCI package):
> library(MultinomialCI)
> t1 <- c("A","A","B","C","B","C","E","E","A","C","B","A","C","C")
> t2 <- c("B","C","C","D","C","D","A","E","D","B","A","C","D","B")
>
> transition_probs <- data.frame(t1, t2, stringsAsFactors = TRUE)
>
> # illustration for departure state A
> est_trans_prob_A <- table(transition_probs$t2[transition_probs$t1=="A"])/
+ length(transition_probs$t2[transition_probs$t1=="A"])
> ci <- multinomialCI(table(transition_probs$t2[transition_probs$t1=="A"]),
+ alpha=0.05)
>
> result <- data.frame(est_p = est_trans_prob_A, lo_ci=ci[,1], up_ci=ci[,2])
> result
est_p.Var1 est_p.Freq lo_ci up_ci
1 A 0.00 0.00 0.6083073
2 B 0.25 0.00 0.8583073
3 C 0.50 0.25 1.0000000
4 D 0.25 0.00 0.8583073
5 E 0.00 0.00 0.6083073
On the other hand, if you have some prior knowledge about the transition probabilities, you might prefer the CoinMinD package, which allows you to specify a Dirichlet prior and run a Bayesian analysis using MCMC.
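If you would rather avoid an extra package, the Dirichlet prior is conjugate to the multinomial, so the posterior for one row can be sampled directly in base R via normalized Gamma draws. The counts and the flat prior below are illustrative assumptions:

```r
set.seed(1)
row_counts <- c(A = 0, B = 1, C = 2, D = 1, E = 0)  # hypothetical counts out of state A
prior <- rep(1, 5)                                   # flat Dirichlet(1, ..., 1) prior
alpha <- prior + row_counts                          # conjugate posterior parameters

# One Dirichlet(alpha) draw = independent Gamma(alpha_i) draws, normalized
rdirichlet1 <- function(a) { g <- rgamma(length(a), shape = a); g / sum(g) }
post <- t(replicate(5000, rdirichlet1(alpha)))

# 95% credible intervals for each transition probability in the row
ci <- apply(post, 2, quantile, probs = c(0.025, 0.975))
```

With a flat prior the intervals will be broadly comparable to the multinomial confidence intervals above; an informative prior shrinks them toward your prior beliefs.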
Best Answer
Short Answer
Use the Kolmogorov-Smirnov test to identify a time window such that the eigenvalues of the transition matrix are not sensitive to the precise choice of window.
Longer Answer
In Markov theory, there are two ways of formulating the dynamics: in discrete time, via a transition matrix $T$ giving the probabilities of moving between states over a fixed window $\delta t$, and in continuous time, via a rate (generator) matrix $Q$. One can show that these two formulations are related through a matrix exponential.
\begin{equation} T = e^{Q\delta t} \end{equation}
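This relationship is easy to verify numerically. Below is a sketch with a hypothetical two-state generator $Q$ (off-diagonal rates, rows summing to zero), using `expm` from the Matrix package that ships with R:

```r
library(Matrix)

# Hypothetical 2-state generator: off-diagonal entries are rates, rows sum to zero
Q <- matrix(c(-0.5,  0.5,
               0.2, -0.2), nrow = 2, byrow = TRUE)
dt <- 1
Tmat <- as.matrix(expm(Q * dt))  # T = exp(Q * dt)

# The result is a proper stochastic matrix: nonnegative entries, rows sum to 1
rowSums(Tmat)
```

Any valid generator produces a valid transition matrix this way, for any choice of $\delta t > 0$.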
Continuous Time Being Well Defined
Regarding your first point about self-transitions being well defined: the
pnext.msm
quantity gives the probability of which state will come next. This is achieved by the equation in your question, which simply calculates the relative probabilities of the transitions.
Discrete Time Choice of Time Window
Now, let's say we want to look at discrete time and talk about "Which state will we be in 1 second from now?". Then we can easily find our $T$ from $Q$ if we know what our $\delta t$ is.
How do we choose the time window? Well, we want a window such that the process is memoryless (Markovian); real-world processes are often non-Markovian on short time scales.
To check for Markovianity, we compute the transition matrix over a range of windows and verify that the implied rate from the dominant (nontrivial) eigenvalue $\lambda$ is insensitive to the precise choice of window:
\begin{equation} \frac{\log \lambda(\delta t)}{\delta t} \approx \frac{\log \lambda(2\delta t)}{2\delta t} \end{equation}
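The check above can be sketched numerically. Here the transition matrices at $\delta t$ and $2\delta t$ are generated from a hypothetical generator $Q$ (in practice you would estimate them from data at each window); for a truly Markovian process the implied rates agree:

```r
library(Matrix)

# Hypothetical generator and the induced transition matrices at dt and 2*dt
Q <- matrix(c(-0.5,  0.5,
               0.2, -0.2), nrow = 2, byrow = TRUE)
dt <- 0.5
T1 <- as.matrix(expm(Q * dt))
T2 <- as.matrix(expm(Q * 2 * dt))

# Dominant nontrivial eigenvalue (the largest eigenvalue is always 1)
dom_eig <- function(M) sort(Re(eigen(M)$values), decreasing = TRUE)[2]
rate1 <- log(dom_eig(T1)) / dt
rate2 <- log(dom_eig(T2)) / (2 * dt)

# For a Markovian process, rate1 and rate2 coincide; with estimated matrices,
# look for the window beyond which they stop changing.
```

With empirically estimated matrices the two rates will only be approximately equal, so in practice one plots $\log\lambda(\delta t)/\delta t$ against $\delta t$ and picks a window in the flat region.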