I am trying to classify a small dataset (around 500 records) into two classes. I used various methods such as SVM, Naive Bayes, and a k-NN classifier. Now, I would like to set the results from one of the classifiers as my baseline and perform statistical hypothesis testing. I am not very familiar with statistical testing, and I wonder how to proceed.
I have been thinking of setting the SVM classifier as my baseline, but I am not sure how to perform a t-test (or similar) on the data. The input dataset has 10 attributes. Should I use the classification results from two classifiers and do a paired t-test on them? For example, I could take the result from Naive Bayes and perform the paired t-test with the SVM results (which is the baseline). Is this the right approach?
Also, I am confused about the null and alternative hypotheses. Would someone be willing to explain how to set up the null and alternative hypotheses here?
Best Answer
In general layman's terms (and not just for this problem): the null hypothesis is the "boring" default claim that there is no real difference between the things you are comparing — here, that the two classifiers perform equally well and any observed gap is due to chance. The alternative hypothesis is what you suspect instead: that there is a genuine difference in performance. The test then tells you how surprising your observed results would be if the null hypothesis were true; a small p-value means "surprising enough to reject the null."
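To make that framing concrete, here is a minimal sketch of a paired t-test on per-fold cross-validation accuracies. The accuracy numbers are invented purely for illustration, and `scipy` is assumed to be available:

```python
from scipy import stats

# Hypothetical per-fold accuracies from the same 10-fold CV split
# (these numbers are made up for illustration).
svm_acc = [0.82, 0.79, 0.85, 0.80, 0.83, 0.81, 0.84, 0.78, 0.82, 0.80]
nb_acc = [0.78, 0.77, 0.80, 0.79, 0.76, 0.78, 0.81, 0.75, 0.79, 0.77]

# H0: the mean difference in fold accuracy is zero (classifiers perform the same).
# H1: the mean difference is non-zero (two-sided alternative).
t_stat, p_value = stats.ttest_rel(svm_acc, nb_acc)

print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the accuracy difference is statistically significant.")
else:
    print("Fail to reject H0: no significant difference detected.")
```

Because the folds are the same for both classifiers, the observations are paired, which is why `ttest_rel` (rather than an independent-samples t-test) is the appropriate call.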
For your classifier performance comparison problem, I recommend reading Chapter 6 of Japkowicz & Shah, which goes into detail on how to use significance testing to assess the performance of different classifiers. (Other chapters give more background on classifier comparison - sounds like they might interest you too.)
In your case, two options fit well:

1. If you are comparing just two classifiers (say, your SVM baseline against Naive Bayes) on this single dataset, use a paired test on matched results: McNemar's test on the per-example correctness of the two classifiers over the same test set, or a paired t-test on per-fold accuracies from k-fold cross-validation. The null hypothesis is that the two classifiers perform equally well; the alternative is that they do not.
2. If you are comparing all three classifiers, running each pairwise test at the usual significance level inflates the chance of a false positive. Either apply a multiple-comparison correction (e.g., Bonferroni) to the pairwise p-values, or use an omnibus test such as Cochran's Q (the multi-classifier extension of McNemar's test) before doing post-hoc pairwise comparisons.
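For two classifiers evaluated on the same test set, McNemar's test can be computed directly from the discordant pairs. The labels and predictions below are made up for illustration, and the exact test is implemented via `scipy.stats.binomtest` rather than a dedicated McNemar function:

```python
import numpy as np
from scipy.stats import binomtest

# Hypothetical true labels and predictions on a shared test set (made-up data).
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0])
svm_pred = np.array([0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0])
nb_pred = np.array([0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0])

svm_correct = svm_pred == y_true
nb_correct = nb_pred == y_true

# Discordant pairs: examples where exactly one classifier is correct.
b = int(np.sum(svm_correct & ~nb_correct))  # SVM right, NB wrong
c = int(np.sum(~svm_correct & nb_correct))  # NB right, SVM wrong

# Exact McNemar test: under H0 the discordant outcomes split 50/50,
# so test b successes out of b + c trials against p = 0.5.
result = binomtest(b, b + c, p=0.5)
print(f"b = {b}, c = {c}, p = {result.pvalue:.4f}")
```

Note that McNemar's test ignores the examples both classifiers got right or both got wrong; only the disagreements carry information about which classifier is better.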
The book goes into far more detail, so I do recommend reading that chapter.
And in terms of baselines, the tests I've mentioned don't distinguish between a baseline and a non-baseline. This is a good thing, as it gives you flexibility to decide which comparisons you should give more importance to in your analysis. The number of tests you actually do determines whether you should rely on 1. or 2. above.