Solved – define prior probabilities in naive Bayes with unbalanced classes and asymmetric cost

naive bayes, prior, r

I'm trying to apply naive Bayes to the following supervised problem:

  • It's a binary classification problem
  • The classes are unbalanced. The target class makes up 0.004266432 of the total and the majority class 0.995733568.
  • There is also an unbalanced cost scheme given by the following formula:

    Profit = 5000 * TP - 100 * FP

    TP: number of true positives; FP: number of false positives

    The objective is to maximize the Profit function.
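For concreteness, the profit function above can be written as a small scoring helper. This is a minimal Python sketch, not klaR code; the function name `profit` and the 0/1 label encoding (1 = target class) are assumptions made for illustration.

```python
def profit(y_true, y_pred, gain_tp=5000, cost_fp=100):
    """Profit = gain_tp * TP - cost_fp * FP, with labels encoded as 0/1.

    Assumption: 1 marks the target (minority) class. False negatives and
    true negatives contribute nothing, as in the formula above.
    """
    # True positives: predicted 1 and actually 1.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    # False positives: predicted 1 but actually 0.
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return gain_tp * tp - cost_fp * fp


if __name__ == "__main__":
    # One true positive and one false positive: 5000 - 100.
    print(profit([1, 0, 0, 1], [1, 1, 0, 0]))
```

Any candidate model or cut-off can then be compared on the same held-out labels with this one number.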

I'm using the klaR package in R to fit the model, so it's possible to adjust the priors.

Questions:

1) Is it possible to use the prior probabilities to improve the model, taking into account the asymmetric cost scheme and/or the class imbalance?

2) The predict() function outputs a class prediction and a probability. The problem is that the probabilities of the minority class are too small. Is it possible to use the priors or scale the probabilities in a clever way to get a better cut-off point?

So far, the results I get using naive Bayes are about half as good as those from other methods (random forest, lasso), so I suspect there is a way to improve the naive Bayes approach.

Best Answer

The NaiveBayes function in klaR already computes class priors from the proportions in the training set. If the numbers you have given are computed from the training set, then there is nothing to gain by specifying priors to the function. The cost scheme is irrelevant when fitting a naïve Bayes model.
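The cost scheme can still enter after fitting, at the point where a posterior probability is turned into a class label. Assuming the payoffs in the question (5000 per true positive, -100 per false positive, nothing for cases left unflagged), flagging a case with posterior p has expected profit 5000p - 100(1 - p), which is positive when p > 100/5100 ≈ 0.0196. A minimal Python sketch of that decision rule follows; the function names are hypothetical, and it works on any list of posterior probabilities rather than on klaR output specifically.

```python
def profit_threshold(gain_tp=5000.0, cost_fp=100.0):
    """Posterior above which flagging a case has positive expected profit.

    Expected profit of flagging a case with posterior p is
    gain_tp * p - cost_fp * (1 - p), which exceeds 0 exactly when
    p > cost_fp / (gain_tp + cost_fp).
    """
    return cost_fp / (gain_tp + cost_fp)


def decide(posteriors, threshold=None):
    """Return True for each case whose target-class posterior beats the threshold."""
    if threshold is None:
        # Default to the payoffs assumed from the question.
        threshold = profit_threshold()
    return [p > threshold for p in posteriors]


if __name__ == "__main__":
    print(profit_threshold())      # cut-off implied by the assumed payoffs
    print(decide([0.01, 0.05]))    # only the second case is flagged
```

Naive Bayes posteriors are often poorly calibrated, so this cut-off is best treated as a starting point; comparing profit on held-out data across several cut-offs is a reasonable check.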

The poor performance of naïve Bayes is probably due to the independence assumption that it makes. To get better results within the principled framework that naïve Bayes is based on, you need to switch to a model that makes more appropriate assumptions.