There are three common Naive Bayes classifier variants: Bernoulli, Multinomial, and Gaussian.
If the samples in a dataset have continuous-valued features, the Gaussian Naive Bayes classifier is typically used.
On this point I am a little confused. Some people told me that I do not need to check whether the features are normally distributed within each class before using the Gaussian Naive Bayes classifier; I can use it directly.
I think this is not true: when we build the classifier's discriminant functions, we apply Bayes' theorem with the naive independence assumption, and we take each class-conditional likelihood to be the probability density function of a normal distribution. If the features of our samples are not normally distributed, we cannot use the Gaussian density function directly.
Which one is true?
Do we have to check whether the samples are normally distributed before using the Gaussian Naive Bayes classifier?
They are partly correct. You do not have to check the distributions of the features (variables) you intend to use, mostly because the Naive Bayes framework itself does not assume any particular distribution for the features; a normal distribution is simply a convenient choice for continuous data.
That aside, if your data does not fit the density of a normal distribution well, this may hurt performance, but not always. Naive Bayes, although very simple, performs very well on complex problems with complex data sets. The only assumption Naive Bayes makes is that the features/variables are conditionally independent given the class, and even this is a loose restriction in practice.
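To make this concrete, here is a minimal from-scratch sketch of a Gaussian Naive Bayes classifier fitted on deliberately non-normal (exponential) features. The data, class labels, and parameter choices are all synthetic assumptions for illustration; the point is only that the model can classify well even when the normality assumption is violated.

```python
import math
import random

def fit(X, y):
    """Per class: estimate prior, feature means, and feature variances."""
    params = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9  # small floor for stability
                     for col, m in zip(zip(*rows), means)]
        params[c] = (n / len(y), means, variances)
    return params

def predict(params, x):
    """Pick the class with the highest log-posterior under Gaussian likelihoods."""
    def log_post(c):
        prior, means, variances = params[c]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
    return max(params, key=log_post)

random.seed(0)
# Class 0: exponential features with mean 1; class 1: mean 5.
# Both are heavily right-skewed, i.e. clearly not normal.
X = [[random.expovariate(1.0), random.expovariate(1.0)] for _ in range(500)] + \
    [[random.expovariate(0.2), random.expovariate(0.2)] for _ in range(500)]
y = [0] * 500 + [1] * 500

params = fit(X, y)
accuracy = sum(predict(params, x) == label for x, label in zip(X, y)) / len(X)
print(accuracy)  # well above the 0.5 chance level despite the non-normal features
```

The classes differ mainly in their means, and the Gaussian fit captures that much even though it misrepresents the skew, which is why accuracy stays high.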
This is an investigation you will have to carry out for your own specific problem.
Compare initial results with unscaled data versus scaled (or transformed) data and see whether performance or inference differs.
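One quick diagnostic for that investigation (a hypothetical sketch, not something prescribed above) is a moment-based skewness estimate before and after a monotone transform such as the log. If the transform brings the features noticeably closer to symmetric, refitting the Gaussian Naive Bayes on the transformed features is worth trying. The exponential sample below is a stand-in for your own feature column.

```python
import math
import random

def skewness(xs):
    """Sample skewness: third standardized moment (0 for a symmetric sample)."""
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    return sum(((x - m) / s) ** 3 for x in xs) / n

random.seed(0)
raw = [random.expovariate(1.0) for _ in range(10_000)]  # heavily right-skewed
logged = [math.log(x) for x in raw]                     # log transform (positive data)

print(skewness(raw))     # large positive skew for an exponential sample
print(skewness(logged))  # smaller in magnitude, i.e. closer to symmetric
```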
You may also benefit from reading a previous post, Use of kernel density estimate in Naive Bayes Classifier?