Solved – Should PCA (always) be done before Naive Bayes classification?

Tags: classification, naive-bayes, pca, predictive-models

According to the Wikipedia page on Naive Bayes:

"Naive Bayes classifiers are a family of simple 'probabilistic
classifiers' based on applying Bayes' theorem with strong (naive)
independence assumptions between the features."

Since data features may not be independent of each other, should one always perform PCA before applying Naive Bayes? PCA produces components that are uncorrelated with each other, so one might expect more robust results from Naive Bayes.

Best Answer

In general, I don't think doing PCA first will improve the classification results of a Naive Bayes classifier. Naive Bayes assumes that the features are *conditionally* independent given the class, i.e. $p(x_i \mid C_k) = p(x_i \mid x_{i+1}, \dots, x_n, C_k)$. This does not require the features to be (marginally) independent.
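To make the distinction concrete, here is a small sketch (synthetic data, assumed class means and variances chosen for illustration): within each class the two features are independent, yet pooled over both classes they are strongly correlated, and Naive Bayes still classifies well because its assumption is the *conditional* one.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
n = 2000

# Within each class the two features are drawn independently
# (diagonal covariance), but the class means differ, so pooled
# over both classes the features are correlated.
X0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n, 2))
X1 = rng.normal(loc=[3.0, 3.0], scale=1.0, size=(n, 2))
X = np.vstack([X0, X1])
y = np.r_[np.zeros(n), np.ones(n)]

marginal_corr = np.corrcoef(X.T)[0, 1]  # large, despite conditional independence
acc = GaussianNB().fit(X, y).score(X, y)  # high training accuracy
print(marginal_corr, acc)
```

Here the marginal correlation comes entirely from the between-class mean difference; Naive Bayes is unaffected because, conditioned on the class, its independence assumption actually holds.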

Moreover, I don't think PCA improves conditional independence in general. Applying PCA without dimension reduction is just a coordinate rotation, chosen without taking into account the discriminative power between classes, and in most cases that rotation will not yield uncorrelated features *within* each class, as the figure below shows.

[Figure: two-class scatter plot after a PCA rotation; the features remain correlated within each class.]

Using PCA for dimension reduction can even make things worse: a feature with real discriminative power but small variance may be thrown away by PCA.
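The failure mode in the last sentence is easy to reproduce. A minimal sketch (synthetic data, assumed variances chosen to make the point): one feature has huge variance but carries no class information, the other has tiny variance but cleanly separates the classes. PCA with one component keeps the high-variance noise axis, and Naive Bayes on the reduced data drops to roughly chance level.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
n = 2000

# Feature 0: large variance, identical distribution in both classes (pure noise).
# Feature 1: small variance, but the class means are well separated.
noise = rng.normal(0.0, 10.0, size=2 * n)
signal = np.r_[rng.normal(-0.5, 0.1, n), rng.normal(0.5, 0.1, n)]
X = np.column_stack([noise, signal])
y = np.r_[np.zeros(n), np.ones(n)]

full = GaussianNB().fit(X, y).score(X, y)          # near-perfect
X1 = PCA(n_components=1).fit_transform(X)          # keeps the noise direction
reduced = GaussianNB().fit(X1, y).score(X1, y)     # near chance
print(full, reduced)
```

PCA ranks directions purely by variance, so the first component is dominated by the uninformative feature; the discriminative one is discarded along with the second component.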