Solved – Markov chain model likelihood ratio test

degrees-of-freedom, likelihood-ratio, markov-process

Suppose I am using two Markov chain models, one of order $k=1$ and a second one of order $k=2$. I am "reducing" the higher-order model to a $k=1$ model in order to simplify the calculations.

I train each model on the same data and also calculate the log-likelihoods on the same data. Now I want to perform a likelihood ratio test (LRT) for model selection, since the models are nested.

To do so I need the LRT statistic (which is straightforward) and the degrees of freedom. Currently, I determine the df as the difference between the number of free parameters of the $k=2$ model ($m^2(m-1)$) and that of the null model $k=1$ ($m(m-1)$), where $m$ is the number of states.
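To make the bookkeeping concrete, the test described above can be written out as follows. The symbols $\ell_1$, $\ell_2$ and $\Lambda$ are notation assumed here for the maximised log-likelihoods of the two models and for the test statistic; they do not appear in the original post.

$$\Lambda = 2\,(\ell_2 - \ell_1), \qquad \mathrm{df} = m^2(m-1) - m(m-1) = m(m-1)^2 .$$

Under the null model $k=1$, $\Lambda$ is approximately $\chi^2$-distributed with that many degrees of freedom, so for example $m=10$ states already gives $\mathrm{df}=810$.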

The problem now is that the degrees of freedom are very, very high, so I get a high p-value all the time, which says that I should stick with my null model. I am unsure whether this is the right way to do it. The second-order model is much sparser, so do I really need to use the worst-case number of parameters, or can I restrict that count somehow?

Maybe someone can help me out with that.
Cheers!

Best Answer

I see no problem with your approach: if the number of states is $m$, there are $m\times(m-1)$ free parameters when $k=1$ and $m\times m\times(m-1)$ free parameters when $k=2$. Unless your data depend strongly on the two past states, the likelihood ratio test will favour $k=1$.
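The parameter counting above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function names (`max_log_likelihood`, `chi2_sf_even`, `lrt`) are made up for this example, states are assumed to be coded as integers `0..m-1`, and both models are scored on the same transitions (from the third observation onwards) so that the log-likelihoods are comparable. Since $m^2(m-1) - m(m-1) = m(m-1)^2$ is always even, the chi-square tail probability is computed with the closed form for even degrees of freedom instead of calling an external library.

```python
import math
import random
from collections import Counter


def max_log_likelihood(seq, order, start=2):
    """Maximised log-likelihood of an order-`order` chain on seq[start:].

    Both models are scored on the same transitions (t >= start) so that
    their log-likelihoods can be compared directly.
    """
    joint = Counter()    # counts of (context, next state)
    context = Counter()  # counts of the context alone
    for t in range(start, len(seq)):
        c = tuple(seq[t - order:t])
        joint[(c, seq[t])] += 1
        context[c] += 1
    # The MLE of each transition probability is count(c, s) / count(c).
    return sum(n * math.log(n / context[c]) for (c, _), n in joint.items())


def chi2_sf_even(x, df):
    """P(X > x) for a chi-square variable with an even, positive df.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!,
    evaluated in log space to avoid underflow for large x.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs an even, positive df")
    half = x / 2.0
    if half <= 0:
        return 1.0
    logs = [-half + i * math.log(half) - math.lgamma(i + 1)
            for i in range(df // 2)]
    top = max(logs)
    return min(1.0, math.exp(top) * sum(math.exp(v - top) for v in logs))


def lrt(seq, m):
    """Likelihood ratio test of k=1 (null) against k=2 for m states."""
    ll1 = max_log_likelihood(seq, order=1)
    ll2 = max_log_likelihood(seq, order=2)
    stat = max(0.0, 2.0 * (ll2 - ll1))          # guard against rounding
    df = m * m * (m - 1) - m * (m - 1)          # worst-case parameter counts
    return stat, df, chi2_sf_even(stat, df)


# Demo: simulate a genuinely first-order chain with m = 3 states
# (the transition matrix P is an arbitrary choice for illustration).
rng = random.Random(0)
P = [[0.7, 0.2, 0.1], [0.1, 0.7, 0.2], [0.2, 0.1, 0.7]]
seq = [0]
for _ in range(2000):
    seq.append(rng.choices(range(3), weights=P[seq[-1]])[0])

stat, df, p = lrt(seq, m=3)
print(f"LRT statistic = {stat:.2f}, df = {df}, p-value = {p:.3f}")
```

Because the simulated data really are first-order, the test would usually not reject $k=1$ here; a sequence whose next state depends on the two previous states should produce a much larger statistic.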

If you want to reduce the number of parameters for $k=2$, you have to do it "by hand", i.e. by introducing restrictions on those $m\times m\times(m-1)$ free parameters, or use a variable-length Markov chain.
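As one possible illustration of such a restriction (an assumption made for this sketch, not something the answer prescribes), a chosen set of two-state contexts can be given its own transition row while every other context shares the first-order row of its most recent state. The function name `restricted_fit` and the argument `special` are hypothetical, and states are assumed to be coded `0..m-1`. The model stays nested above $k=1$, and the degrees of freedom against $k=1$ become the returned parameter count minus $m(m-1)$.

```python
import math
from collections import Counter


def restricted_fit(seq, m, special):
    """Log-likelihood and free-parameter count of a restricted order-2 chain.

    Contexts (a, b) listed in `special` get their own transition row; all
    other contexts share the first-order row of their last state b.
    """
    def row_key(a, b):
        return (a, b) if (a, b) in special else (b,)

    joint = Counter()  # counts of (row, next state)
    total = Counter()  # counts of the row alone
    for t in range(2, len(seq)):
        key = row_key(seq[t - 2], seq[t - 1])
        joint[(key, seq[t])] += 1
        total[key] += 1
    ll = sum(n * math.log(n / total[k]) for (k, _), n in joint.items())

    # Each distinct row contributes (m - 1) free probabilities.
    rows = {row_key(a, b) for a in range(m) for b in range(m)}
    return ll, len(rows) * (m - 1)


# Usage: 3 states, only the context (0, 1) gets a dedicated row.
example = [0, 1, 2, 0, 1, 1, 2, 0, 1, 2, 2, 0, 1, 0, 1, 2]
ll, n_params = restricted_fit(example, m=3, special={(0, 1)})
df_vs_first_order = n_params - 3 * (3 - 1)
print(ll, n_params, df_vs_first_order)
```

The chi-square reference distribution is only a reasonable guide if the set of special contexts is fixed in advance; picking it after looking at the same data makes the nominal p-value too optimistic.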