Solved – In Kneser Ney smoothing, how to implement the recursion in the formula

language-modelsnatural languagesmoothing

I'm working in a project trying to implement the Kneser-Key algorithm. I think I got up to the step of implementing this formula for bigrams:

$P_{(KN)}(w_i|w_{i-1}) = \frac{max(c(w_{-1}, w_{1}) –
\delta, 0)}{\sum_{w'}{c(w_{i-1}, w')}} + \lambda_{w_{i-1}}P_{continuation}(w_i) $

But this is not the final formula. The final formula includes a recursion, where you consider more than 2 gram levels. I intend to use from 4grams to unigrams in my code. So, the formula with the recursion is as follows:

enter image description here

Source: https://lagunita.stanford.edu/c4x/Engineering/CS-224N/asset/slp4.pdf

I'm not too familiar with this notation, so not sure how to add up the 4grams probability with the 3 grams, bigrams and unigrams.

Thanks a lot

Best Answer

It means that in order to compute the $k$-gram probability distribution, you first need to compute the $(k-1)$-gram probability distribution.