Solved – How to get feature importance for Gaussian Naive Bayes classifier

classification, feature selection, importance, machine learning, naive bayes

I have a dataset consisting of 4 classes and around 200 features. I have implemented a Gaussian Naive Bayes classifier. I now want to calculate the importance of each feature for each pair of classes according to the Gaussian Naive Bayes classifier. In the end, I want to visualize the 10 most important features for each pair of classes. That means for class 1 vs class 2, I want the importance of feature 1, feature 2, etc.

My proposal would be to calculate $\log\frac{P(X_n|Y=i)}{P(X_n|Y=j)}$ for each pair of classes.

I have calculated the mean and variance for each feature and each class. That means I have a mean and variance for each of the 200 features and each of the 4 classes. Taking the normal distribution I can classify a new data point.
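A minimal sketch of that step (function name and variable layout are my own, not from the question) — per-class mean and variance of each feature, which is exactly what a Gaussian Naive Bayes fit stores:

```python
import numpy as np

def class_stats(X, y):
    """Per-class mean and variance of each feature, as a Gaussian NB fit computes.

    X: array of shape (n_samples, n_features)
    y: array of shape (n_samples,) with class labels
    Returns (classes, means, variances) where means and variances
    have shape (n_classes, n_features).
    """
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    variances = np.array([X[y == c].var(axis=0) for c in classes])
    return classes, means, variances
```

With 4 classes and 200 features this yields two 4 × 200 arrays, matching the description above. (scikit-learn's `GaussianNB` exposes the same quantities as `theta_` and `var_` after fitting.)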

How would you now calculate the feature importance measure?

Let's say we calculate $\log\frac{P(X_1|Y=1)}{P(X_1|Y=2)}$, that means the first feature ($X_1$) and class 1 and class 2, so it should give the importance of the first feature for distinguishing class 1 from class 2.

I have the normal distributions of the first feature for both class 1 and class 2, but how should I calculate the probability, i.e., at which point should I evaluate the normal density?

Best Answer

The discriminative value of a feature is based on its statistical distance between classes.

I have calculated the mean and variance for each feature and each class

Using your estimated mean $\hat{\mu}_{i,j}$ and estimated variance $\hat{\sigma}_{i,j}^2$ for feature $i$ and class $j$, one approach would be to compute the symmetric KL divergence of each feature between the two classes you compare. The feature whose class-conditional distributions are furthest apart is the most discriminative one for that pair.

The KL divergence between two univariate normal distributions has a simple closed form:
$$D_{KL}\left(\mathcal{N}(\mu_p,\sigma_p^2)\,\|\,\mathcal{N}(\mu_q,\sigma_q^2)\right) = \log\frac{\sigma_q}{\sigma_p} + \frac{\sigma_p^2 + (\mu_p - \mu_q)^2}{2\sigma_q^2} - \frac{1}{2},$$
and the symmetric version is just $D_{KL}(p\|q) + D_{KL}(q\|p)$.
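A short sketch of this in code (function names are my own; `means` and `variances` are assumed to be `(n_classes, n_features)` arrays of the fitted per-class statistics). It computes the symmetric KL divergence per feature for a class pair and returns the indices of the top-$k$ features:

```python
import numpy as np

def kl_normal(mu_p, var_p, mu_q, var_q):
    # Closed-form KL(p || q) for univariate Gaussians
    # p = N(mu_p, var_p), q = N(mu_q, var_q); works elementwise on arrays.
    return 0.5 * (np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)

def top_features(means, variances, i, j, k=10):
    # Symmetric KL divergence of each feature between classes i and j,
    # then the indices of the k features with the largest divergence.
    d = (kl_normal(means[i], variances[i], means[j], variances[j])
         + kl_normal(means[j], variances[j], means[i], variances[i]))
    return np.argsort(d)[::-1][:k]
```

For the visualization, calling `top_features(means, variances, i, j)` for each of the six class pairs gives the 10 features to plot per pair. Note this sidesteps the question of where to evaluate the densities: the divergence compares the whole distributions, not their values at a single point.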