I am working on an eight-state multi-state Markov model (the eighth state is absorbing). What I am having difficulty understanding is how sensitive the results are to the initial qmatrix, and what the best way is to create that initial matrix.
For example, in http://leg.ufpr.br/~silvia/papers/mmm.pdf, the matrix is:
$$Q = \begin{pmatrix}
-(q_{12}+q_{14}) & q_{12} & 0 & q_{14} \\
q_{21} & -(q_{21}+q_{23}+q_{24}) & q_{23} & q_{24} \\
\vdots & \vdots & \vdots & \vdots
\end{pmatrix}$$
But when the Q matrix is created immediately below that, using the following code,

Q <- rbind(c(0, 0.25, 0, 0.25),
           c(0.166, 0, 0.166, 0.166),
           c(0, 0.25, 0, 0.25),
           c(0, 0, 0, 0))
there are no negative values. In fact, for Q[1,1], the value is zero, not the negative of the sums of q12+q14. Also, the fourth row, which is the absorbing state, is entered as all zeros. Why wouldn't Q[4,4] be 1?
How would I go about creating that initial matrix, given the frequency of times the subjects are observed moving between states? (For another example, please see the state table provided below.)
On a more general note, if someone could recommend a good primer for understanding and using Markov models (skewed towards a general audience), and interpreting their results, I would be grateful.
Here is the state table for my data (in R):
dput(mm2myTrajectoryStateTable)
structure(c(13295L, 946L, 126L, 6L, 1L, 1L, 0L, 3461L, 15035L,
937L, 100L, 19L, 17L, 2L, 636L, 2014L, 3977L, 146L, 4L, 16L,
4L, 46L, 153L, 518L, 1141L, 0L, 10L, 19L, 84L, 109L, 16L, 3L,
257L, 25L, 0L, 18L, 117L, 62L, 13L, 72L, 446L, 11L, 0L, 4L, 12L,
31L, 1L, 28L, 139L, 12909L, 15089L, 4377L, 680L, 146L, 193L,
38L), .Dim = 7:8, .Dimnames = structure(list(from = c("1", "2",
"3", "4", "5", "6", "7"), to = c("1", "2", "3", "4", "5", "6",
"7", "8")), .Names = c("from", "to")), class = "table")
Best Answer
Synopsis
The construction of the transition matrix $P(t)$ from $Q$ and $t$ is explained (with a simple $2\times 2$ example) at Solving the Kolmogorov forward equation for transition probabilities. Obtaining $Q$ from $P=P(t)$ is effected by reversing the process.
Methods
To summarize, that thread shows that $P(t) = \exp(tQ)$ (the matrix exponential) can be computed by diagonalizing $Q$ and exponentiating the diagonal terms (all multiplied by the time $t$). Thus, $Q$ is a matrix logarithm which can be computed by diagonalizing $P$ and taking the logs of its diagonal terms.
Specifically, upon finding a matrix $V$ for which
$$P(t) = V\ \Lambda\ V^{-1}$$
with $\Lambda$ having the values $\lambda_1, \lambda_2, \ldots, \lambda_n$ on the diagonal (for $n$ states, including all absorbing states) and zeros elsewhere, compute that
$$Q = V\ \frac{1}{t}\log(\Lambda)\ V^{-1}$$
where $\log(\Lambda)$ has the values $\log(\lambda_1), \log(\lambda_2), \ldots, \log(\lambda_n)$ on its diagonal and zeros elsewhere.
There is a technical complication: it can require matrices with complex coefficients to achieve the diagonalization. However, after the calculation is complete, all the results will be real numbers. Dealing with this will, in practice, merely require taking the real part of the answer (and perhaps checking that the imaginary parts are all very small in comparison, just as a double-check).
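To make the recipe concrete, here is a minimal R sketch (not code from the linked thread) that applies these formulas to a hypothetical $2\times 2$ transition matrix, including the step of taking the real part at the end:

```r
# Sketch: recover the generator Q from a transition matrix P over time step t
# by diagonalizing P and taking logarithms of its eigenvalues.
# The matrix P below is hypothetical, chosen to have positive eigenvalues.
P <- rbind(c(0.9, 0.1),
           c(0.2, 0.8))
t <- 1
e <- eigen(P)                                  # P = V Lambda V^{-1}
V <- e$vectors
# Q = V (1/t) log(Lambda) V^{-1}; work in the complex domain, then take Re().
Q <- Re(V %*% diag(log(as.complex(e$values)) / t) %*% solve(V))
rowSums(Q)                                     # ~ 0 up to floating-point error
```

As a double-check, exponentiating $tQ$ (for example with `expm::expm`) should recover $P$, and the rows of $Q$ should sum to zero.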
[Figure: the $8\times 8$ infinitesimal generator $Q$ computed from the matrix of the question.]
As expected, the rows sum to zero. I have rounded the values to three decimal places, colored the backgrounds according to the rounded values, and desaturated those colors to reflect the magnitudes of the entries (so that grayish entries are small). The patterns clearly stand out.
Assessing uncertainty
When $P$ is estimated from data, its entries, and therefore the entries of $Q$, are subject to sampling variation. Perhaps the most straightforward way to assess that uncertainty is through simulation (a parametric bootstrap). For the matrix $P$ given in the question, here are simple summaries of the $7\times 8$ deviations between $9999$ bootstrapped values of $Q$ and the computed value of $Q$:
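A self-contained sketch of such a parametric bootstrap in R (the count matrix and the helper `q.from.p` are hypothetical illustrations, not the answer's original code): resample the transition counts row by row as multinomials, re-estimate $P$, recompute $Q$, and summarize the spread of the bootstrapped entries.

```r
# Matrix logarithm of P, scaled by 1/t, via eigendecomposition (hypothetical helper).
q.from.p <- function(P, t = 1) {
  e <- eigen(P)
  Re(e$vectors %*% diag(log(as.complex(e$values)) / t) %*% solve(e$vectors))
}

set.seed(17)
counts <- rbind(c(90, 10),                     # hypothetical observed transition counts
                c(20, 80))
Q.hat <- q.from.p(counts / rowSums(counts))    # point estimate of Q

# Parametric bootstrap: resample each row of counts as a multinomial draw.
sim <- replicate(999, {
  counts.star <- t(apply(counts, 1, function(n) rmultinom(1, sum(n), n)))
  q.from.p(counts.star / rowSums(counts.star))
})
se <- apply(sim, 1:2, sd)                      # bootstrap standard error of each Q entry
```

With $9999$ iterations in place of $999$ this mirrors the scale of the simulation described above, at a proportionally higher cost.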
The full $56$-variate distribution can be estimated from the simulation data if desired.
Code
The following R code performs the calculations and simulations. The essential part is performed by the function q.matrix, which computes the matrix logarithm of its argument (presumed to be a square matrix) after normalizing each row to sum to unity. The key lines obtain the eigenvectors and eigenvalues, compute the logarithms of the eigenvalues, and reassemble the result into $Q$.

(Edit November 2019: clog is a complex logarithm; for positive real $x,$ $\operatorname{clog}(-x) = \log(x) + \pi i$. It is needed in order to handle negative eigenvalues.)

The full code follows. The simulation requires about six seconds; a simulation with just $999$ iterations would produce helpful results and take less than one second.