Solved – When to use Bernoulli Naive Bayes

machine learning, naive bayes, python

Below is a dummy scatter plot of x, y where BLUE (0) and RED (1) are the 'target'. The yellow dot is my input, and I'm asking whether the prediction for it is BLUE (0) or RED (1).

Using Gaussian Naive Bayes (GaussianNB), I get a prediction of BLUE (0) with a probability of 99.99% (which makes sense).

Now, when I use Bernoulli Naive Bayes (BernoulliNB), I get a prediction of RED (1) with a probability of 0.4202. (BTW, Multinomial NB is also off: 0.57%.)

Questions:

  1. When would one use Bernoulli Naive Bayes (I'd appreciate an example), and
  2. Why is Bernoulli's prediction so far off in this instance?

[Image: scatter plot of x, y with BLUE (0) and RED (1) points and the yellow input point]

Best Answer

Bernoulli naive Bayes is for binary features only. Similarly, multinomial naive Bayes models features as event counts (nonnegative frequencies). Your example uses real-valued features $(x,y)$, which are neither binary indicators nor counts, so neither model applies to your features.
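As a rough illustration of why the Bernoulli prediction can look arbitrary here: scikit-learn's BernoulliNB has a `binarize` parameter (default `0.0`) that maps every feature value above the threshold to 1 and everything else to 0 before fitting. The sketch below mimics that thresholding in plain Python. The points are made up, and it assumes the plotted coordinates were all positive, which the question does not state.

```python
def binarize(point, threshold=0.0):
    """Map each coordinate to 1 if it exceeds the threshold, else 0."""
    return tuple(1 if value > threshold else 0 for value in point)

# Hypothetical, well-separated clusters (not the data from the question).
blue_points = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.1)]
red_points = [(6.0, 7.2), (7.5, 6.8), (8.0, 7.9)]
query_point = (1.4, 1.0)  # sits in the middle of the blue cluster

blue_binary = {binarize(p) for p in blue_points}
red_binary = {binarize(p) for p in red_points}
query_binary = binarize(query_point)

# After thresholding, both classes and the query share one feature vector,
# so the binarized features carry no information about the class.
print(blue_binary, red_binary, query_binary)
```

Under that assumption the classifier can only fall back on the class priors, so its output says nothing about where the yellow dot actually lies.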

A typical example (taken from the wiki page) for either Bernoulli or multinomial NB is document classification, where the features represent the presence or absence of a term (in the Bernoulli case) or the number of times a term occurs (in the multinomial case).
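To make the document-classification case concrete, here is a minimal from-scratch sketch of the Bernoulli event model on term-presence features. The toy documents, the labels, and the function names `train_bernoulli_nb` and `predict` are all hypothetical; this is not scikit-learn's implementation, only the textbook model with Laplace smoothing.

```python
import math
from collections import defaultdict

def train_bernoulli_nb(docs, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on term presence/absence."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    classes = sorted(set(labels))
    n_docs = {c: labels.count(c) for c in classes}
    # doc_freq[c][w] = number of class-c documents that contain w
    doc_freq = {c: defaultdict(int) for c in classes}
    for doc, c in zip(docs, labels):
        for word in set(doc.split()):
            doc_freq[c][word] += 1
    log_prior = {c: math.log(n_docs[c] / len(docs)) for c in classes}
    # Laplace-smoothed estimate of P(term present | class)
    p_present = {
        c: {w: (doc_freq[c][w] + alpha) / (n_docs[c] + 2 * alpha) for w in vocab}
        for c in classes
    }
    return vocab, classes, log_prior, p_present

def predict(model, doc):
    """Return the class with the highest log posterior for one document."""
    vocab, classes, log_prior, p_present = model
    present = set(doc.split())
    scores = {}
    for c in classes:
        score = log_prior[c]
        for w in vocab:
            p = p_present[c][w]
            # Absent terms also contribute, through (1 - p).
            score += math.log(p if w in present else 1.0 - p)
        scores[c] = score
    return max(scores, key=scores.get)

docs = [
    "cheap pills buy now",
    "buy cheap watches",
    "meeting agenda attached",
    "lunch meeting tomorrow",
]
labels = ["spam", "spam", "ham", "ham"]
model = train_bernoulli_nb(docs, labels)

print(predict(model, "buy cheap pills"))
print(predict(model, "agenda for tomorrow meeting"))
```

Each feature is a yes/no answer to "does this term appear?", which is exactly the kind of input the Bernoulli model expects. A multinomial variant would instead use how many times each term appears.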