Solved – Iris dataset and a-priori probabilities

naive bayes, r

I have been playing around with two R packages for naive Bayes classification (e1071 and klaR) using the Iris dataset as an example.

During the training phase, the output shows an a-priori probability of 0.3333 for each of the three classes:

A-priori probabilities:
    setosa versicolor  virginica 
 0.3333333  0.3333333  0.3333333

Why is the probability the same for all three classes? Does it mean that if I test my model on an unknown flower, there is a 33% chance of it being classified as setosa, versicolor, or virginica?

Thanks.

Best Answer

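The iris data contains fifty examples of each of the three classes. Without doing any analysis, it should be obvious that a randomly selected example from the iris data has a one-third chance of belonging to each of those classes. This is what a priori means: the probability of a class before any of the flower's measurements are taken into account. It is not the chance that a particular new flower receives a given label. Once the measurements are known, the classifier combines the prior with the likelihood of those measurements under each class, so the predicted class depends on the features.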
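As a minimal sketch of where those numbers come from, the priors can be reproduced in base R as the relative class frequencies of the training data. This does not call e1071 or klaR; it only assumes that both packages estimate the priors from class frequencies by default, which is consistent with the output in the question. The unbalanced subset is a hypothetical example, not something from the original post.

```r
# Class counts in the built-in iris data: 50 rows per species.
counts <- table(iris$Species)

# A-priori probabilities are the relative class frequencies.
priors <- prop.table(counts)
print(priors)

# Hypothetical unbalanced training set: rows 1-50 of iris are setosa,
# so dropping rows 1-25 leaves 25 setosa, 50 versicolor, 50 virginica.
unbalanced <- iris[-(1:25), ]
priors_unbalanced <- prop.table(table(unbalanced$Species))
print(priors_unbalanced)
```

With the full data every class gets 1/3. With the unbalanced subset the priors become 0.2, 0.4 and 0.4, which shows that the values reflect only the class balance of the training set, not any property of the flowers themselves.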
