Solved – Greater than 1 Naive Bayes Probabilities

conditional-probability, naive-bayes, probability

I am trying to train a Naive Bayes classifier. In addition to the most likely class, I would also like it to output the probability associated with each label.

I am making two assumptions: 1) conditional independence of the features given the class label, and 2) independence of the features. However, the math does not seem to work out (I get probabilities greater than 1 for certain labels).

Let's assume we are dealing with two features ($F_1$ and $F_2$). This is the probability I want to compute:

$$P(C|F_1,F_2)$$

Where $C$ is the class. By Bayes' rule:

$$P(C|F_1,F_2) = \frac{P(F_1,F_2|C)P(C)}{P(F_1,F_2)}$$

Using the independence assumptions above:

$$P(C|F_1,F_2) = \frac{P(F_1|C)P(F_2|C)P(C)}{P(F_1)P(F_2)}$$

Now, let's say we train the Naive Bayes classifier on the following data:

[Image: training data table]

Now suppose we want to classify a new observation with $F_1=1$ and $F_2=1$.

So let's first compute $P(C=A|F_1=1,F_2=1)$:

$$P(C=A|F_1=1,F_2=1)=\frac{P(F_1=1|C=A)P(F_2=1|C=A)P(C=A)}{P(F_1=1)P(F_2=1)}=\frac{1\cdot 1\cdot\frac{1}{2}}{\frac{1}{2}\cdot\frac{1}{2}}=2$$
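For concreteness, here is a minimal Python sketch that reproduces the arithmetic above using only the probabilities already stated (the variable names are mine, not from any particular implementation):

```python
# Reproduce the calculation above: the denominator treats F1 and F2
# as if they were marginally independent.
p_f1_given_a = 1.0   # P(F1=1 | C=A)
p_f2_given_a = 1.0   # P(F2=1 | C=A)
p_a = 0.5            # P(C=A)
p_f1 = 0.5           # P(F1=1)
p_f2 = 0.5           # P(F2=1)

numerator = p_f1_given_a * p_f2_given_a * p_a
denominator = p_f1 * p_f2            # assumes P(F1, F2) = P(F1) * P(F2)
print(numerator / denominator)       # 2.0 -- not a valid probability
```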

Clearly, I have gone wrong somewhere. However, I can't pinpoint it. Any insights would be highly appreciated!

Best Answer

$F_{1}$ and $F_{2}$ are independent given $C$, but that does not make them marginally independent. So the problem is in the denominator: you cannot factor $P(F_{1},F_{2})$ as $P(F_{1})P(F_{2})$. Recall that $P(F_{1},F_{2}) = \sum_{C}P(F_{1},F_{2},C) = \sum_{C}P(F_{1}|C)P(F_{2}|C)P(C)$.
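Here is a minimal sketch of the corrected computation. Since the training table is not shown, the class-$B$ likelihoods below are inferred from the numbers in the question (assuming only two classes, $P(F_1=1)=1\cdot\frac{1}{2}+P(F_1=1|C=B)\cdot\frac{1}{2}=\frac{1}{2}$ forces $P(F_1=1|C=B)=0$, and likewise for $F_2$); treat them as an assumption:

```python
# Corrected calculation: obtain P(F1, F2) by marginalizing over C,
# P(F1, F2) = sum_C P(F1 | C) * P(F2 | C) * P(C).
likelihoods = {
    # class: (P(F1=1 | C), P(F2=1 | C), P(C))
    "A": (1.0, 1.0, 0.5),
    "B": (0.0, 0.0, 0.5),  # assumed values, implied by P(F1=1) = P(F2=1) = 1/2
}

joint = {c: p_f1 * p_f2 * p_c for c, (p_f1, p_f2, p_c) in likelihoods.items()}
evidence = sum(joint.values())        # P(F1=1, F2=1) = 0.5, not 0.25

posteriors = {c: j / evidence for c, j in joint.items()}
print(posteriors)                     # {'A': 1.0, 'B': 0.0} -- valid probabilities
```

With the marginalized denominator the posteriors sum to 1, which is why most implementations normalize the per-class joint scores rather than ever computing $P(F_1)P(F_2)$.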