On the relationship between linear algebra and Markov chains

linear-algebra, markov-chains, probability, probability-theory, stochastic-processes

I am currently taking an introductory module on Markov chains, which does not cover any linear algebra. However, I also took an introductory module on linear algebra in my first year of college, and I am trying to draw some links between the two subjects.

I recently thought of two concepts relating to stationary distributions and steady-state distributions.

On stationary distributions

I have learnt that there are only three possible situations with regard to stationary distributions: there is either no stationary distribution, a unique stationary distribution, or infinitely many stationary distributions. I immediately recalled a similar idea from linear algebra, where a system of linear equations likewise has either no solution, a unique solution, or infinitely many solutions. Thus, I am wondering whether the following statement always holds: there exists a unique stationary distribution if and only if the one-step transition matrix is invertible.

On steady-state distributions

I know that one way to find a steady-state distribution is to compute the $n^{\text{th}}$ power of the one-step transition matrix and let $n$ tend to infinity. Thus, I am also wondering whether the following statement always holds: there exists a steady-state distribution if and only if the one-step transition matrix is diagonalisable, i.e. it can be written in the form $A = PDP^{-1}$.
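To make the "let $n$ tend to infinity" idea concrete, here is a quick numerical sketch using a made-up $2$-state chain (the matrix below is purely illustrative, not taken from my module):

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of row lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# An illustrative 2-state transition matrix.
P = [[0.9, 0.1],
     [0.2, 0.8]]

# Raise P to a large power by repeated multiplication.
Pn = P
for _ in range(200):
    Pn = matmul(Pn, P)

# Every row of P^n approaches the steady-state distribution (2/3, 1/3).
print(Pn[0])  # ≈ [0.6667, 0.3333]
print(Pn[1])  # ≈ [0.6667, 0.3333]
```

For this particular chain one can check by hand that $\big(\tfrac{2}{3},\tfrac{1}{3}\big)$ is stationary, and the rows of $P^n$ do indeed converge to it.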

I am thinking that the backward direction should hold, i.e. if the one-step transition matrix is diagonalisable, then there exists a steady-state distribution. However, I am not sure about the other direction.


I have discussed these with my professor, but he says he does not know, off-hand, of any such theorems in general, although he does agree that there are many links between Markov chains and linear algebra. Thus, I thought it best to bring my queries here, where there are so many brilliant minds 🙂 Note also that my understanding of both Markov chains and linear algebra is rather basic, so my thoughts above may not be very sophisticated.

I would very much appreciate it if anyone could confirm (or deny) the two "links" mentioned above. Moreover, if anyone has any other intuitive/well-known/widely used linear algebra theorems regarding Markov chains, do pen them down!

Best Answer

Neither of the implications in your first hypothesis holds. A Markov chain with transition matrix $$ \pmatrix{\frac{1}{4}&\frac{3}{4}\\ \frac{1}{4}&\frac{3}{4}} $$ has the unique stationary distribution $\ \big(\frac{1}{4},\frac{3}{4}\big)\ $ (indeed, $\ \mu^TP=\big(\frac{1}{4},\frac{3}{4}\big)\ $ for *every* probability vector $\ \mu\ $), but the transition matrix is not invertible, since its two rows are identical. The Markov chain with transition matrix $$ \pmatrix{1&0\\0&1} $$ has an infinite number of stationary distributions (any $2$-component probability vector is a stationary distribution), but the transition matrix is invertible.
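Both counterexamples can be verified exactly with a few lines of Python, using `fractions.Fraction` to avoid floating-point error (the helper below is a small sketch, not library code):

```python
from fractions import Fraction as F

def row_times_matrix(pi, P):
    """Compute the row vector pi * P."""
    n = len(P)
    return [sum(pi[k] * P[k][j] for k in range(n)) for j in range(n)]

# Counterexample 1: unique stationary distribution, singular transition matrix.
P = [[F(1, 4), F(3, 4)],
     [F(1, 4), F(3, 4)]]
pi = [F(1, 4), F(3, 4)]
print(row_times_matrix(pi, P) == pi)          # True: pi is stationary
print(P[0][0] * P[1][1] - P[0][1] * P[1][0])  # 0: the determinant vanishes

# Counterexample 2: the identity matrix is invertible, yet every
# probability vector (a, 1 - a) is stationary.
I2 = [[F(1), F(0)], [F(0), F(1)]]
a = F(1, 3)                                   # an arbitrary choice of a
pi2 = [a, 1 - a]
print(row_times_matrix(pi2, I2) == pi2)       # True
```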

Your basic idea is sound, but you're looking at the wrong matrix. If $\ P\ $ is the ($\ n\times n\ $) transition matrix of an $\ n$-state Markov chain, and $\ \mathbb{P}\ $ is the $\ (n+1)\times(n+1)\ $ matrix defined by $$ \mathbb{P}=\pmatrix{P-I_{n\times n}&\mathbb{1}_n\\\mathbb{1}_n^T&0}\ , $$ then an $\ n\times1\ $ probability vector $\ \mu\ $ is a stationary distribution of the chain if and only if it satisfies the equations $$ \pmatrix{\mu^T&1}\mathbb{P}=\pmatrix{\mathbb{1}_n^T&1}\ . $$ (Multiplying out the blocks, these equations say precisely that $\ \mu^TP=\mu^T\ $ and $\ \mu^T\mathbb{1}_n=1\ $.) Consequently, the stationary distribution of the chain is unique if and only if the matrix $\ \mathbb{P}\ $ is invertible.
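The following sketch builds the augmented matrix $\ \mathbb{P}\ $ for the two $2$-state examples above and checks its invertibility via its ($3\times3$) determinant; the helper names are my own:

```python
from fractions import Fraction as F

def augmented(P):
    """Build the (n+1)x(n+1) block matrix [[P - I, 1], [1^T, 0]]."""
    n = len(P)
    M = [[P[i][j] - (F(1) if i == j else F(0)) for j in range(n)] + [F(1)]
         for i in range(n)]
    M.append([F(1)] * n + [F(0)])
    return M

def det3(M):
    """Determinant of a 3x3 matrix (enough for these 2-state examples)."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# Chain with a unique stationary distribution: augmented matrix invertible.
P_unique = [[F(1, 4), F(3, 4)], [F(1, 4), F(3, 4)]]
print(det3(augmented(P_unique)))    # 2 — nonzero, so invertible

# Identity chain (infinitely many stationary distributions): singular.
P_identity = [[F(1), F(0)], [F(0), F(1)]]
print(det3(augmented(P_identity)))  # 0 — singular
```

Note that the singular/invertible status of $\ \mathbb{P}\ $ tracks uniqueness correctly in both cases, even though the original transition matrices behave in exactly the opposite way.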

If "steady-state distribution" simply means a stationary distribution, the backward implication of your second hypothesis is trivially true for homogeneous finite-state Markov chains, because every such chain has a stationary distribution (so diagonalisability is not even needed). If it means the limit of $\ P^n\ $, however, diagonalisability is not sufficient: the diagonalisable matrix $\ \pmatrix{0&1\\1&0}\ $ is periodic, and its powers alternate without converging. The forward implication does not hold in either reading, because there exist finite-state Markov chains that converge to a steady-state distribution even though their transition matrices are not diagonalisable.

Addendum

I should also have mentioned the commonly given sufficient condition for a homogeneous finite-state Markov chain to have a unique stationary distribution: it will have one if it is irreducible, and if it is moreover aperiodic, then $\ P^n\ $ converges and the unique stationary distribution is also the steady-state distribution. Irreducibility and aperiodicity together hold if and only if every positive integral power of the transition matrix is irreducible.
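A practically convenient equivalent of "irreducible and aperiodic" for finite chains is primitivity: some power of $\ P\ $ has all entries strictly positive (a standard Perron–Frobenius fact). A small sketch of that test, with illustrative matrices of my own choosing:

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of row lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_regular(P, max_power=50):
    """Return True if some power of P (up to max_power) is strictly positive,
    i.e. the chain is irreducible and aperiodic (primitive)."""
    Pk = P
    for _ in range(max_power):
        if all(x > 0 for row in Pk for x in row):
            return True
        Pk = matmul(Pk, P)
    return False

# Irreducible and aperiodic: P^2 already has all entries positive.
P_good = [[0.0, 1.0], [0.5, 0.5]]
print(is_regular(P_good))      # True

# Irreducible but periodic (period 2): no power is strictly positive.
P_periodic = [[0.0, 1.0], [1.0, 0.0]]
print(is_regular(P_periodic))  # False
```

The `max_power` cutoff is a practical shortcut; for a genuinely primitive $n\times n$ stochastic matrix, a power with all entries positive is guaranteed to occur at exponent at most $(n-1)^2+1$.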
