Bayesian – How to Use Kernel Density Estimate in Naive Bayes Classifier?

bayesian, kde

This question is a follow-up to my earlier question here,
and is also related, in intent, to this question.

On this wiki page, probability density values from a normal distribution fitted to the training set are used to calculate a Bayesian posterior, rather than actual probability values. If the training set is not normally distributed, would it be equally valid to use density values taken from a kernel density estimate of the training set to calculate the Bayesian posterior?

In the intended application, this kernel density estimate would be taken from a theoretically ideal empirical data set generated by Monte Carlo (MC) techniques.

Best Answer

I have read the first linked earlier question, in particular whuber's answer and the comments on it.

The answer is yes: you can use the density from a KDE of a numeric variable as the conditional probability $P(X=x \mid C=c)$ in Bayes' theorem, $P(C=c \mid X=x) = P(C=c)\,P(X=x \mid C=c) / P(X=x)$.
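As a minimal sketch of this idea (the data, class labels, and the choice of scipy's gaussian_kde are my own illustration, not part of the original answer), fit one KDE per class and plug its density at $x$ into the theorem:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Hypothetical training data: one numeric feature, two classes.
# Class "b" is bimodal, so a single-normal likelihood would fit it badly.
train = {
    "a": rng.normal(170, 7, size=500),
    "b": np.concatenate([rng.normal(155, 5, size=300),
                         rng.normal(185, 5, size=200)]),
}

# One KDE per class: kdes[c](x) plays the role of P(X=x|C=c).
kdes = {c: gaussian_kde(xs) for c, xs in train.items()}

# Class priors P(C=c) from the class frequencies.
n_total = sum(len(xs) for xs in train.values())
priors = {c: len(xs) / n_total for c, xs in train.items()}

def posterior(x):
    """Bayes' theorem with KDE likelihoods for a single observation x."""
    # Unnormalized scores: P(C=c) * P(X=x|C=c)
    scores = {c: priors[c] * kdes[c](x)[0] for c in kdes}
    z = sum(scores.values())  # = P(X=x), identical for every class
    return {c: s / z for c, s in scores.items()}

print(posterior(160.0))
```

With several (conditionally independent) numeric features, the naive Bayes assumption simply multiplies one such KDE likelihood per feature into the numerator.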

Since $d(\mathrm{height})$, i.e. the marginal density $P(X=x)$, is the same across all classes, it is normalized out when the theorem is applied, i.e. when the numerator $P(C=c)\,P(X=x \mid C=c)$ is divided by $P(X=x)$.
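Writing the denominator out via the law of total probability makes the cancellation explicit:

$$P(C=c \mid X=x) = \frac{P(C=c)\,P(X=x \mid C=c)}{\sum_{c'} P(C=c')\,P(X=x \mid C=c')}$$

The sum in the denominator does not depend on $c$, so it rescales every class by the same factor and leaves the most probable class unchanged.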

This paper could be interesting for you: Estimating Continuous Distributions in Bayesian Classifiers (John & Langley, 1995).
