Solved – Prediction using Naive Bayes from the klaR package fails

machine learning, naive bayes, r

I am trying to replicate, in R, an example from chapter 6 of Tom Mitchell's book Machine Learning (1997).

There are 14 training examples (shown below) of the target concept PlayTennis, where each day is described by the attributes Outlook, Temperature, Humidity, and Windy.

Training examples:

Outlook,Temperature,Humidity,Windy,Play
overcast,cool,normal,true,yes
overcast,hot,high,false,yes
overcast,hot,normal,false,yes
overcast,mild,high,true,yes
rainy,cool,normal,false,yes
rainy,mild,high,false,yes
rainy,mild,normal,false,yes
sunny,cool,normal,false,yes
sunny,mild,normal,true,yes
rainy,cool,normal,true,no
rainy,mild,high,true,no
sunny,hot,high,false,no
sunny,hot,high,true,no
sunny,mild,high,false,no

Here's my code:

library("klaR")
library("caret")

data = read.csv("example.csv")

x = data[,-5]
y = data$Play

model = train(x,y,'nb',trControl=trainControl(method='cv',number=10))

Outlook <- "sunny"
Temperature <- "cool"
Humidity <- "high"
Windy <- "true"

instance <- data.frame(Outlook,Temperature,Humidity,Windy)

predict(model$finalModel,instance)
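
For reference, the priors and conditional probability tables the model learned can be inspected directly (assuming caret's finalModel is a klaR NaiveBayes object, which stores them in the apriori and tables components):

model$finalModel$apriori   # class priors
model$finalModel$tables    # per-attribute conditional probability tables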

The example tries to predict the outcome for

Outlook=sunny, Temperature=cool, Humidity=high and Wind=strong (the book's Wind=strong corresponds to Windy=true in my data).

The problem is that I am getting a different prediction from the one in the book.

Here are the probabilities I get from my code:

no          yes
0.001078835 0.9989212

Here are the book's probabilities:

no     yes
0.0206 0.0053

My code classifies the unseen instance as yes, while the book's classifier classifies it as no.

Shouldn't both give the same answer since we are using the same naive Bayes classifier?

EDIT:

I replicated the example with scikit-learn's MultinomialNB classifier and got the following probabilities:

no    yes
0.769  0.231

which are close to the book's probabilities after normalization.

The book's probabilities, normalized:

no     yes
0.795  0.205
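
Normalization here just divides each raw score by the sum of the two; a one-line check in R:

book <- c(no = 0.0206, yes = 0.0053)
book / sum(book)   # no ≈ 0.795, yes ≈ 0.205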

Best Answer

The problem is small enough that you can work it out by hand. For your example you have
$$
\begin{align*}
P(outlook=sunny \mid play=yes) &= \frac{2}{9}\\
P(temp=cool \mid play=yes) &= \frac{3}{9}\\
P(humidity=high \mid play=yes) &= \frac{3}{9}\\
P(windy=true \mid play=yes) &= \frac{3}{9}\\
P(play=yes) &= \frac{9}{14}.
\end{align*}
$$
Putting it all together you have
$$
\begin{align*}
P(play=yes \mid sunny, cool, high, true) &\varpropto \frac{2}{9} \left(\frac{3}{9}\right)^3 \frac{9}{14}\\
&\approx 0.0053,
\end{align*}
$$
which agrees with Mitchell. The same calculation for the no class uses $P(sunny \mid no) = \frac{3}{5}$, $P(cool \mid no) = \frac{1}{5}$, $P(high \mid no) = \frac{4}{5}$, $P(true \mid no) = \frac{3}{5}$, and $P(play=no) = \frac{5}{14}$, giving
$$
P(play=no \mid sunny, cool, high, true) \varpropto \frac{3}{5} \cdot \frac{1}{5} \cdot \frac{4}{5} \cdot \frac{3}{5} \cdot \frac{5}{14} \approx 0.0206,
$$
which is the book's other number.

I don't use R, so I can't speak as to why the output is different. Obviously the package you're using is normalizing, but this shouldn't change the classification. If I had to guess, I'd say it is the cross-validation.
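
One way to test that guess (a sketch, not something I have verified; it assumes example.csv holds the table above) is to fit klaR's NaiveBayes directly, skipping caret's resampling wrapper and forcing every column to be a factor so all attributes are treated as categorical:

library(klaR)

# Read the data with every column forced to a factor, so klaR treats
# all four attributes (and the label) as categorical
train_df <- read.csv("example.csv", colClasses = "factor")

# Fit naive Bayes directly, with no cross-validation wrapper
fit <- NaiveBayes(Play ~ ., data = train_df)

# Build the test instance with the same factor levels as the training data
instance <- data.frame(
  Outlook     = factor("sunny", levels = levels(train_df$Outlook)),
  Temperature = factor("cool",  levels = levels(train_df$Temperature)),
  Humidity    = factor("high",  levels = levels(train_df$Humidity)),
  Windy       = factor("true",  levels = levels(train_df$Windy))
)

predict(fit, instance)$posterior

If this direct fit reproduces the roughly 0.795/0.205 split, the discrepancy comes from the caret wrapper (or from how the columns were typed on the way in) rather than from naive Bayes itself.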