Solved – Problem in understanding hidden Markov model EM training and log-likelihood computation results

classification, hidden markov model, MATLAB

I'm using Kevin Murphy's HMM Toolbox, but I have a problem understanding the results.

My problem is to classify some sequences with a hidden Markov model.
I'm using a (test) dataset built from random sequences of the symbols '1', '3' and '6'.
This is my training dataset:
data = [ 1 1 1 3 6;
1 1 3 3 6;
1 3 3 6 6]
i.e. 3 observation sequences of 5 symbols each.

I use EM (Baum-Welch) to train the model and dhmm_logprob to evaluate this new observation:
data1 = [1 3 6]

I'm expecting a log probability near 0, something like -0.1, but instead I get these results:

loglik = -110.04
loglik = -110.63
loglik = -91.679

Why do I get these weird results?

This is my MATLAB code:

O = 10;   % number of observation symbols
Q = 12;   % number of hidden states

% training data
T = 5;    % length of each training sequence
nex = 3;  % number of training sequences
data = [ 1 1 1 3 6;
         1 1 3 3 6;
         1 3 3 6 6]

% initial guess of parameters
prior1 = normalise(rand(Q,1));
transmat1 = mk_stochastic(rand(Q,Q));
obsmat1 = mk_stochastic(rand(Q,O));

% improve guess of parameters using EM
[LL, prior2, transmat2, obsmat2] = dhmm_em(data, prior1, transmat1, obsmat1, 'max_iter', 50);
LL   % log likelihood after each EM iteration

% use model to compute log likelihood
data1 = [1 3 6]
loglik = dhmm_logprob(data1, prior2, transmat2, obsmat2)

EDIT: Even when I evaluate on the same data I used for training, I get similarly weird results. Why is that?

Best Answer

Log likelihood is used as a mathematical convenience (it turns multiplication into addition). This is especially useful when working with very small numbers (as in evaluating and training hidden Markov models), in order to avoid underflow problems.
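To see the underflow problem concretely, here is a tiny sketch in plain MATLAB (the probabilities are made up for illustration; nothing here comes from the toolbox):

p = 0.01 * ones(1, 200);     % 200 per-step probabilities of 0.01 each
raw = prod(p)                % 1e-400 underflows to exactly 0 in double precision
ll  = sum(log(p))            % -921.03, perfectly representable in log space

The product is what you would naively compute; the sum of logs is what the toolbox reports, which is why the numbers look so large in magnitude.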

In the case of an HMM (hidden Markov model), the log likelihood is, I think, obtained in the following way:

$$ \log P(\text{sequences} \mid \text{model}) = \sum_{i=1}^{n} \log P(x^{(i)} \mid \text{model}) $$

where $x^{(i)}$ is the $i$-th sequence in your training set.
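If I remember correctly, dhmm_logprob already performs this sum when you hand it several sequences at once (one per row), so the following manual sketch, using the variable names from your code, should give the same number:

total_ll = 0;
for i = 1:size(data, 1)
    total_ll = total_ll + dhmm_logprob(data(i,:), prior2, transmat2, obsmat2);
end
total_ll    % should match dhmm_logprob(data, prior2, transmat2, obsmat2)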

HMMs trained with EM are never guaranteed to converge to a global optimum, so training will usually get stuck in a local optimum (which, incidentally, will not explain your data perfectly).
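A common workaround (just a sketch of the idea, reusing the same toolbox calls as your code) is to run EM from several random initialisations and keep the model whose final log likelihood is highest:

best_ll = -Inf;
for r = 1:10                                   % 10 random restarts
    prior0    = normalise(rand(Q,1));
    transmat0 = mk_stochastic(rand(Q,Q));
    obsmat0   = mk_stochastic(rand(Q,O));
    [LL, prior, transmat, obsmat] = dhmm_em(data, prior0, transmat0, obsmat0, 'max_iter', 50);
    if LL(end) > best_ll                       % LL(end) = log likelihood after the last iteration
        best_ll = LL(end);
        prior2 = prior; transmat2 = transmat; obsmat2 = obsmat;
    end
end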

In order to actually see how these low values arise, you can try doing the computation of $P(\text{sequences} \mid \text{model})$ by hand. In the case of hidden Markov models you can do this with the forward algorithm, which is fairly simple and straightforward to use (it is explained in numerous places on the web, but if you run into issues you can come back here and I'll point you in the right direction).
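For reference, here is a minimal, self-contained sketch of the scaled forward algorithm for a discrete HMM. It assumes the toolbox's parameter layout (prior is Q x 1, transmat is Q x Q indexed by previous state then next state, obsmat is Q x O), but it is hypothetical code, independent of the toolbox:

function loglik = forward_loglik(seq, prior, transmat, obsmat)
% Scaled forward algorithm for a discrete-observation HMM.
% seq      1 x T vector of observation symbols (integers 1..O)
% prior    Q x 1 initial state distribution
% transmat Q x Q transition matrix, transmat(i,j) = P(q_t = j | q_{t-1} = i)
% obsmat   Q x O emission matrix,   obsmat(i,k)  = P(o_t = k | q_t = i)
T = length(seq);
alpha = prior(:) .* obsmat(:, seq(1));   % forward variable at t = 1
scale = sum(alpha);                      % scaling keeps alpha from underflowing
alpha = alpha / scale;
loglik = log(scale);
for t = 2:T
    alpha = (transmat' * alpha) .* obsmat(:, seq(t));
    scale = sum(alpha);
    alpha = alpha / scale;
    loglik = loglik + log(scale);        % accumulate log P(o_1 ... o_t | model)
end
end

Calling forward_loglik([1 3 6], prior2, transmat2, obsmat2) should agree with what dhmm_logprob gives you, and stepping through it shows exactly where the large negative numbers come from.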

One last thing. I see in your code that you use a maximum-iteration parameter to stop the training. An alternative is to use the log likelihood itself as the stopping criterion: stop training when the difference in log likelihood between the current iteration and the previous one falls below a certain threshold. If you try this instead of the maximum iteration count, it also illustrates nicely that the model converges to a local optimum: you will see the log likelihood increase from iteration to iteration, but in ever smaller increments.
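You can already see this from the LL vector that dhmm_em returns, which (if I remember correctly) holds the log likelihood after each EM iteration; the threshold check below is just a sketch (dhmm_em may also accept a 'thresh' option that does this internally, but I am not certain of the exact name):

[LL, prior2, transmat2, obsmat2] = dhmm_em(data, prior1, transmat1, obsmat1, 'max_iter', 200);
plot(LL)                                  % learning curve: increasing, with shrinking steps
xlabel('EM iteration'); ylabel('log likelihood');
gains = diff(LL);                         % improvement at each iteration
converged_at = find(gains < 1e-4, 1)      % first iteration where the gain drops below the threshold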