Solved – Naive Bayes logarithmic probability

logarithm, naive bayes

I am trying to do sentiment analysis using Naive Bayes and have a question about logarithms.
When calculating the posterior probability in a Naive Bayes classifier, we apply the log to prevent underflow caused by very small values.
My question is: when computing $\log p(x \mid Y=C)$, do we have to apply the log separately to the numerator and the denominator, like

$$
\frac{\log(\text{count of word in class }C)}{\log(\text{total words in class }C)}
$$

or apply the log to the result of the division, like

$$
\log\left(\frac{\text{count of word in class }C}{\text{total words in class }C}\right)?
$$

Best Answer

The second form:

$$ \log\left(\frac{\text{count of word in class }C}{\text{total words in class }C}\right) $$

is mathematically correct, but computed this way it does not protect you from underflow: you still do the same calculation on the ordinary probability scale and only afterwards transform the result to the log scale.
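As a rough illustration of where the underflow comes from, here is a minimal Python sketch. The per-word likelihood `1e-5` and the document length of 100 words are made-up values chosen only for the demonstration; they are not from the question.

```python
import math

# Hypothetical values: 100 words, each with likelihood 1e-5 under class C.
word_probs = [1e-5] * 100

# Multiplying first: the true value, 1e-500, is far below the smallest
# representable double, so the product underflows to 0.0.
product = math.prod(word_probs)

# Summing the logs instead keeps the result an ordinary finite number.
log_likelihood = sum(math.log(p) for p in word_probs)

print(product)         # 0.0
print(log_likelihood)  # roughly -1151.29
```

Taking `math.log(product)` afterwards would not help: the information is already lost, and `math.log(0.0)` raises a `ValueError`.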

Your first equation:

$$ \frac{\log(\text{count of word in class }C)}{\log(\text{total words in class }C)} $$

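on the other hand, is incorrect: a ratio of two logs is a logarithm in a different base (the change-of-base identity), not the log of the ratio.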

Recall that the basic properties of logs are:

$$ \begin{align} & \log_b(xy)=\log_b(x)+\log_b(y) \\ & \log_b(\tfrac{x}{y})=\log_b(x)-\log_b(y)\\ & \log_b(x^d)=d\log_b(x) \\ & \frac{\log_d(x)}{\log_d(y)} = \log_y x \end{align} $$

so the correct form should be

$$ \log(\text{count of word in class }C) - \log(\text{total words in class }C) $$
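A quick numerical check of the three expressions, as a sketch in Python. The counts (3 occurrences out of 1000 words) are hypothetical, and the variable names are only for illustration.

```python
import math

count_word_in_class = 3       # hypothetical count of the word in class C
total_words_in_class = 1000   # hypothetical total number of words in class C

# Log of the ratio, and the equivalent difference of logs.
log_of_ratio = math.log(count_word_in_class / total_words_in_class)
difference_of_logs = math.log(count_word_in_class) - math.log(total_words_in_class)

# Ratio of logs: this is log base 1000 of 3, a different quantity.
ratio_of_logs = math.log(count_word_in_class) / math.log(total_words_in_class)

print(math.isclose(log_of_ratio, difference_of_logs))  # True
print(ratio_of_logs > 0)                               # True
```

The ratio of logs comes out positive here, which cannot be a log probability, since the log of any probability is at most zero.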

There are many more interesting properties; you can read about them in, for example, the Wikipedia article "List of logarithmic identities".
