Scaling the backward variable in HMM Baum-Welch

beta distribution, hidden markov model, machine learning

I am trying to implement the scaled Baum-Welch algorithm and have run into a problem: after scaling, my backward variables are greater than 1. Is this normal? After all, probabilities shouldn't be greater than 1.

I am using the scale factor I obtained from the forward variables:

$$
c_t = \frac{1}{\sum_{s\in S}\alpha_t(s)}
$$
where $c_t$ is the scaling factor at time $t$, $\alpha_t(s)$ is the forward variable for state $s$, and $S$ is the set of states in the HMM.
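
For reference, here is a sketch of how this factor is usually computed inside the forward recursion. It assumes the common scheme in which each step is normalized using the already-scaled values from the previous step. The symbols $\bar\alpha$, $\hat\alpha$, $a_{rs}$ (transition probability from state $r$ to state $s$), and $b_s(o_t)$ (probability that state $s$ emits observation $o_t$) are notation introduced here, not taken from the post:

$$
\bar\alpha_t(s) = b_s(o_t)\sum_{r\in S}\hat\alpha_{t-1}(r)\,a_{rs},\qquad
c_t = \frac{1}{\sum_{s\in S}\bar\alpha_t(s)},\qquad
\hat\alpha_t(s) = c_t\,\bar\alpha_t(s)
$$

Under this assumption $\sum_{s\in S}\hat\alpha_t(s) = 1$ at every step, and $1/c_t$ is the conditional probability of $o_t$ given the earlier observations.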

I implemented the backward algorithm in Java as follows:

    public double[][] backwardAlgo() {
        int time = eSequence.size();
        // beta[state][t]; this HMM has 2 states
        double[][] beta = new double[2][time];

        // Initialize beta at the final time step with that step's scale factor
        for (int i = 0; i < 2; i++) {
            beta[i][time - 1] = scaler[time - 1];
        }

        // Recurse backward in time, rescaling each step by scaler[t]
        for (int t = time - 2; t >= 0; t--) {
            for (int i = 0; i < 2; i++) {
                double tempBeta = 0;
                for (int j = 0; j < 2; j++) {
                    tempBeta += stateTransitionMatrix[i][j]
                            * emissionMatrix[j][eSequence.get(t + 1)]
                            * beta[j][t + 1];
                }
                beta[i][t] = scaler[t] * tempBeta;
            }
        }
        return beta;
    }

The scale factors are stored in the array called scaler. There are 2 states in this HMM.
I should also note that the scale factors themselves are greater than 1.

Best Answer

I don't think this in itself indicates any problem. $\sum_{s \in S} \alpha_t(s)$ is the probability that the observed output sequence up to time $t$ was eSequence.get(0), eSequence.get(1), $\dots,$ eSequence.get(t). A probability is at most one, so its reciprocal $c_t$ is at least one, and it's fine for $c_t$ to be greater than one. Likewise, at the final time step every $\beta_{\mathrm{time}-1}(s)$ is initialized to $c_{\mathrm{time}-1}$, so $\sum_{s \in S}\beta_{\mathrm{time}-1}(s) = |S|c_{\mathrm{time}-1}$, which can very well be over one.
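
A small numerical check may make this concrete. The sketch below is not the asker's code. It assumes a made-up 2-state, 3-symbol HMM (the arrays `pi`, `a`, `b` and the sequence `obs` are hypothetical), computes the scale factors in a forward pass that is normalized step by step, and reuses them in a backward pass shaped like the one in the question. The names `ScaledHmmDemo`, `forward`, `backward`, and `posterior` are illustrative only.

```java
final class ScaledHmmDemo {

    /** Scaled forward pass: fills scaler[t] and returns alphaHat[state][t]. */
    static double[][] forward(double[] pi, double[][] a, double[][] b,
                              int[] obs, double[] scaler) {
        int n = pi.length;
        int time = obs.length;
        double[][] alpha = new double[n][time];
        for (int t = 0; t < time; t++) {
            double sum = 0.0;
            for (int j = 0; j < n; j++) {
                // Weight of state j before accounting for obs[t]
                double prior = pi[j];
                if (t > 0) {
                    prior = 0.0;
                    for (int i = 0; i < n; i++) {
                        prior += alpha[i][t - 1] * a[i][j];
                    }
                }
                alpha[j][t] = prior * b[j][obs[t]];
                sum += alpha[j][t];
            }
            scaler[t] = 1.0 / sum;          // c_t
            for (int j = 0; j < n; j++) {
                alpha[j][t] *= scaler[t];   // column t now sums to 1
            }
        }
        return alpha;
    }

    /** Scaled backward pass using the forward scale factors. */
    static double[][] backward(double[][] a, double[][] b,
                               int[] obs, double[] scaler) {
        int n = a.length;
        int time = obs.length;
        double[][] beta = new double[n][time];
        for (int i = 0; i < n; i++) {
            beta[i][time - 1] = scaler[time - 1];
        }
        for (int t = time - 2; t >= 0; t--) {
            for (int i = 0; i < n; i++) {
                double sum = 0.0;
                for (int j = 0; j < n; j++) {
                    sum += a[i][j] * b[j][obs[t + 1]] * beta[j][t + 1];
                }
                beta[i][t] = scaler[t] * sum;
            }
        }
        return beta;
    }

    /** State posterior: gamma[i][t] = alphaHat[i][t] * betaHat[i][t] / c_t. */
    static double[][] posterior(double[][] alpha, double[][] beta, double[] scaler) {
        int n = alpha.length;
        int time = scaler.length;
        double[][] gamma = new double[n][time];
        for (int t = 0; t < time; t++) {
            for (int i = 0; i < n; i++) {
                gamma[i][t] = alpha[i][t] * beta[i][t] / scaler[t];
            }
        }
        return gamma;
    }

    public static void main(String[] args) {
        // Hypothetical model parameters, chosen only for illustration
        double[] pi = {0.6, 0.4};
        double[][] a = {{0.7, 0.3}, {0.4, 0.6}};
        double[][] b = {{0.5, 0.4, 0.1}, {0.1, 0.3, 0.6}};
        int[] obs = {0, 1, 2, 0};

        double[] scaler = new double[obs.length];
        double[][] alpha = forward(pi, a, b, obs, scaler);
        double[][] beta = backward(a, b, obs, scaler);
        double[][] gamma = posterior(alpha, beta, scaler);

        for (int t = 0; t < obs.length; t++) {
            double gammaSum = 0.0;
            for (int i = 0; i < pi.length; i++) {
                gammaSum += gamma[i][t];
            }
            System.out.printf("t=%d c_t=%.4f beta=[%.4f, %.4f] sum(gamma)=%.6f%n",
                    t, scaler[t], beta[0][t], beta[1][t], gammaSum);
        }
    }
}
```

With this scaling, the state posterior $\gamma_t(i) = \hat\alpha_t(i)\hat\beta_t(i)/c_t$ sums to one over the states at every $t$, even though individual scaled backward values exceed one. The scaled backward variables are rescaled quantities rather than probabilities, so only ratios like this one need to stay within $[0, 1]$.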