I am trying to implement naive Bayes, but I am running into a problem. I have 5000 word features, so every sample is a binary vector of length 5000, and the true labels are 1 or 0. Because the feature vectors are very sparse, the values of P(feature = 1 | label = 1) are very small (~0.03). When I calculate the numerator, i.e.
P(features | label=1) * P(label=1)
the probability values are so small that, under the conditional independence assumption of naive Bayes, multiplying ~2000 of them together underflows to 0, giving a wrong result. What should be done?
Best Answer
The two most commonly used techniques to prevent underflows with a naive Bayes classifier are:

1. Working in log space: replace the product of probabilities with a sum of log-probabilities, since log(ab) = log(a) + log(b). The class that maximizes the log joint probability is the same class that maximizes the joint probability, so for classification you never need to exponentiate back.
2. Using the log-sum-exp trick when you need the normalized posterior P(label | features): subtract the maximum log joint score from each class's score before exponentiating, so the exponentials stay in a representable range.
More details: Example of how the log-sum-exp trick works in Naive Bayes
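A minimal NumPy sketch of both techniques, using made-up numbers matching the question (2000 factors of ~0.03; the log joint scores at the end are hypothetical placeholders):

```python
import numpy as np

# Multiplying 2000 probabilities of ~0.03 directly underflows float64 to exactly 0.0:
p = np.full(2000, 0.03)
print(np.prod(p))        # -> 0.0

# 1) Work in log space: the product of probabilities becomes a sum of logs,
#    which stays finite (about -7013 here).
log_p = np.log(p).sum()
print(log_p)

# 2) Log-sum-exp trick: normalize log joint scores across classes without
#    exponentiating the hugely negative raw values.
def log_sum_exp(a):
    m = np.max(a)                      # shift by the max so the largest exp is 1
    return m + np.log(np.sum(np.exp(a - m)))

# Hypothetical unnormalized log joints log P(features, label) for labels 0 and 1:
log_joint = np.array([-7013.2, -7010.9])
posterior = np.exp(log_joint - log_sum_exp(log_joint))  # P(label | features)
print(posterior)          # sums to 1; no underflow
```

Exponentiating `log_joint` directly would give 0/0 when normalizing; shifting by the maximum first makes the ratio exact.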