Solved – One class classification with LIBSVM in Weka

libsvm, machine learning, svm, weka

I have a dataset from a particular domain and I want to do one-class classification with LIBSVM (via the wrapper) in Weka. I have trained the classifier, but when I test it on a dataset other than the test set, all instances are reported as correctly classified. I know these instances are topically different from the training data, so they should not be classified correctly. I experimented with the gamma and $\nu$ parameters, but I can't get a reliable model.

What could be the reason for this?

Best Answer

I would not use SVMs for one-class classification problems (especially for text), because their "magic parameters" are hard to interpret (e.g., the well-known $C$ parameter problem).

There are alternative (much simpler) methods that can also do the job and that are easy to implement from scratch. Some of them are mentioned in the paper "One-class document classification via Neural Networks", which I highly recommend reading.
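To give a feeling for how little code such a from-scratch approach can need, here is a minimal sketch of one simple baseline: a centroid of bag-of-words vectors with a cosine-similarity threshold. This is only an illustration and not necessarily one of the methods discussed in the paper. The class name `CentroidOneClass`, the `quantile` parameter, and the whitespace tokenizer are assumptions made for the example.

```python
import math
from collections import Counter


def vectorize(text):
    """Lowercase, split on whitespace, and count tokens (bag of words)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class CentroidOneClass:
    """Hypothetical baseline: accept a document if it is close to the training centroid."""

    def __init__(self, quantile=0.1):
        # Fraction of training documents allowed to fall below the threshold.
        self.quantile = quantile

    def fit(self, docs):
        vectors = [vectorize(d) for d in docs]
        # Summing the counts is enough, because cosine similarity ignores scale.
        self.centroid = Counter()
        for v in vectors:
            self.centroid.update(v)
        sims = sorted(cosine(v, self.centroid) for v in vectors)
        index = min(int(self.quantile * len(sims)), len(sims) - 1)
        self.threshold = sims[index]
        return self

    def score(self, doc):
        """Similarity of a document to the training centroid."""
        return cosine(vectorize(doc), self.centroid)

    def predict(self, doc):
        """True if the document is accepted as belonging to the target class."""
        return self.score(doc) >= self.threshold


# Toy training data for the target class (made up for this example).
train_docs = [
    "the svm kernel uses a gamma parameter",
    "training a one class svm with a kernel",
    "the nu parameter of a one class svm",
    "weka wraps libsvm for svm training",
]
model = CentroidOneClass(quantile=0.1).fit(train_docs)
print(model.predict("a one class svm with a gamma parameter"))
print(model.predict("fresh tomato soup and basil garlic bread"))
```

The only parameter here is `quantile`, which is the share of training documents you are willing to reject, so it is easier to reason about than a kernel width. For real text you would probably want a proper tokenizer and some term weighting, which this sketch leaves out.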

There is also another paper that proposes an authorship verification method (an instance of a one-class classification problem) that deals with cross-topic texts. The model learned in this paper seems to be reliable enough to handle not only cross-topic but also cross-genre cases, even for different languages. Perhaps this could help...