The NaiveBayes() function in the klaR package follows the classical R formula interface, whereby you express your outcome as a function of its predictors, e.g. spam ~ x1 + x2 + x3. If your data are stored in a data.frame, you can include all predictors on the right-hand side of the formula using dot notation: spam ~ ., data=df means "spam as a function of all other variables present in the data.frame called df."
Here is a toy example, using the spam dataset discussed in The Elements of Statistical Learning (Hastie et al., Springer, 2009, 2nd ed.), which is available online. This is really just to get you started with the R function, not with the methodological aspects of using an NB classifier.
data(spam, package="ElemStatLearn")
library(klaR)
# set up a training sample
train.ind <- sample(1:nrow(spam), ceiling(nrow(spam)*2/3), replace=FALSE)
# apply NB classifier
nb.res <- NaiveBayes(spam ~ ., data=spam[train.ind,])
# show the results
opar <- par(mfrow=c(2,4))
plot(nb.res)
par(opar)
# predict on holdout units
nb.pred <- predict(nb.res, spam[-train.ind,])
# raw accuracy
confusion.mat <- table(nb.pred$class, spam[-train.ind,"spam"])
sum(diag(confusion.mat))/sum(confusion.mat)
A recommended add-on package for such ML tasks is the caret package. It offers a lot of useful tools for preprocessing data, handling training/test samples, running different classifiers on the same data, and summarizing the results. It is available from CRAN and comes with many vignettes that describe common tasks.
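As an illustration, here is a minimal caret sketch of the same workflow (assuming caret is installed; method = "nb" wraps klaR's NaiveBayes, and the resampling settings are only illustrative):
library(caret)
# stratified split on the outcome
idx <- createDataPartition(spam$spam, p = 2/3, list = FALSE)
# fit NB with 5-fold cross-validation on the training part
fit <- train(spam ~ ., data = spam[idx, ], method = "nb",
             trControl = trainControl(method = "cv", number = 5))
# evaluate on the held-out part
confusionMatrix(predict(fit, newdata = spam[-idx, ]), spam[-idx, "spam"])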
In general the naive Bayes classifier is not linear, but if the likelihood factors $p(x_i \mid c)$ are from exponential families, the naive Bayes classifier corresponds to a linear classifier in a particular feature space. Here is how to see this.
You can write any naive Bayes classifier as*
$$p(c = 1 \mid \mathbf{x}) = \sigma\left( \sum_i \log \frac{p(x_i \mid c = 1)}{p(x_i \mid c = 0)} + \log \frac{p(c = 1)}{p(c = 0)} \right),$$
where $\sigma$ is the logistic function. If $p(x_i \mid c)$ is from an exponential family, we can write it as
$$p(x_i \mid c) = h_i(x_i)\exp\left(\mathbf{u}_{ic}^\top \phi_i(x_i) - A_i(\mathbf{u}_{ic})\right),$$
and hence
$$p(c = 1 \mid \mathbf{x}) = \sigma\left( \sum_i \mathbf{w}_i^\top \phi_i(x_i) + b \right),$$
where
\begin{align}
\mathbf{w}_i &= \mathbf{u}_{i1} - \mathbf{u}_{i0}, \\
b &= \log \frac{p(c = 1)}{p(c = 0)} - \sum_i \left( A_i(\mathbf{u}_{i1}) - A_i(\mathbf{u}_{i0}) \right).
\end{align}
Note that this is similar to logistic regression – a linear classifier – in the feature space defined by the $\phi_i$. For more than two classes, we analogously get multinomial logistic (or softmax) regression.
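For completeness, here is a sketch of that multiclass form under the same exponential-family assumption (my notation, analogous to the two-class case above):
$$p(c = k \mid \mathbf{x}) = \frac{\exp\left( \sum_i \mathbf{u}_{ik}^\top \phi_i(x_i) - \sum_i A_i(\mathbf{u}_{ik}) + \log p(c = k) \right)}{\sum_{k'} \exp\left( \sum_i \mathbf{u}_{ik'}^\top \phi_i(x_i) - \sum_i A_i(\mathbf{u}_{ik'}) + \log p(c = k') \right)},$$
where the $h_i(x_i)$ factors cancel between numerator and denominator, so each class score is again linear in the features $\phi_i(x_i)$.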
If $p(x_i \mid c)$ is Gaussian with class-dependent mean $\mu_c$ and variance $\sigma_c^2$ (dropping the index $i$ to keep the notation light), then $\phi_i(x_i) = (x_i, x_i^2)$ and we get
\begin{align}
w_{i1} &= \sigma_1^{-2}\mu_1 - \sigma_0^{-2}\mu_0, \\
w_{i2} &= \tfrac{1}{2}\sigma_0^{-2} - \tfrac{1}{2}\sigma_1^{-2}, \\
b_i &= \log \sigma_0 - \log \sigma_1 + \tfrac{1}{2}\sigma_0^{-2}\mu_0^2 - \tfrac{1}{2}\sigma_1^{-2}\mu_1^2,
\end{align}
where $b = \sum_i b_i$, assuming $p(c = 1) = p(c = 0) = \frac{1}{2}$.
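Here is a small numeric check of that Gaussian case (my own sketch with made-up parameter values, not part of the original derivation): the log-odds computed directly from the two normal densities matches the linear form in $(x_i, x_i^2)$ above.
# hypothetical class-conditional parameters for a single feature
mu0 <- 0.5; s0 <- 1.2
mu1 <- 2.0; s1 <- 0.8
w1 <- mu1/s1^2 - mu0/s0^2
w2 <- 1/(2*s0^2) - 1/(2*s1^2)
b  <- log(s0) - log(s1) + mu0^2/(2*s0^2) - mu1^2/(2*s1^2)
x  <- seq(-3, 5, length.out = 9)
# log-odds from the densities (equal priors) vs. the linear form
direct <- dnorm(x, mu1, s1, log = TRUE) - dnorm(x, mu0, s0, log = TRUE)
linear <- w1*x + w2*x^2 + b
all.equal(direct, linear)  # TRUE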
*Here is how to derive this result:
\begin{align}
p(c = 1 \mid \mathbf{x})
&= \frac{p(\mathbf{x} \mid c = 1) p(c = 1)}{p(\mathbf{x} \mid c = 1) p(c = 1) + p(\mathbf{x} \mid c = 0) p(c = 0)} \\
&= \frac{1}{1 + \frac{p(\mathbf{x} \mid c = 0) p(c = 0)}{p(\mathbf{x} \mid c = 1) p(c = 1)}} \\
&= \frac{1}{1 + \exp\left( -\log\frac{p(\mathbf{x} \mid c = 1) p(c = 1)}{p(\mathbf{x} \mid c = 0) p(c = 0)} \right)} \\
&= \sigma\left( \sum_i \log \frac{p(x_i \mid c = 1)}{p(x_i \mid c = 0)} + \log \frac{p(c = 1)}{p(c = 0)} \right)
\end{align}
Best Answer
This seems a bit ambiguous... What's wrong with model$table?