Solved – Top five classifiers to try first

classification, machine learning, methodology

Besides obvious classifier characteristics like

computational cost,
expected data types of features/labels and
suitability for certain sizes and dimensions of data sets,

what are the top five (or 10, 20?) classifiers to try first on a new data set one does not know much about yet (e.g. the semantics and correlation of individual features)? Usually I try Naive Bayes, Nearest Neighbor, Decision Tree and SVM, though I have no good reason for this selection other than that I know them and mostly understand how they work.

I guess one should choose classifiers which cover the most important general classification approaches. Which selection would you recommend, according to that criterion or for any other reason?

UPDATE: An alternative formulation for this question could be: "Which general approaches to classification exist and which specific methods cover the most important/popular/promising ones?"
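
One way to make such a first pass concrete is to run a handful of classifiers through the same cross-validation loop. The following is only a minimal sketch, assuming the built-in iris data as a stand-in for the unknown data set, 5-fold cross-validation, and the packages class, rpart and MASS that ship with standard R installations. LDA is swapped in as a simple linear baseline; Naive Bayes and SVM, mentioned above, would need an extra package such as e1071 and are left out here.

```r
# Sketch: compare a few baseline classifiers with 5-fold cross-validation.
# Assumptions: iris as stand-in data; class, rpart and MASS are available.
library(class)   # knn()
library(rpart)   # decision tree
library(MASS)    # lda(), used here as a simple linear baseline

set.seed(42)
data(iris)
n_folds <- 5
# assign every row to one of the folds at random
folds <- sample(rep(1:n_folds, length.out = nrow(iris)))

# each entry takes a training and a test data frame and returns predicted labels
classifiers <- list(
  knn  = function(train, test) {
    knn(train[, 1:4], test[, 1:4], cl = train$Species, k = 5)
  },
  tree = function(train, test) {
    predict(rpart(Species ~ ., data = train, method = "class"), test, type = "class")
  },
  lda  = function(train, test) {
    predict(lda(Species ~ ., data = train), test)$class
  }
)

# mean accuracy over the folds, one value per classifier
cv_accuracy <- sapply(classifiers, function(fit_predict) {
  mean(sapply(1:n_folds, function(i) {
    train <- iris[folds != i, ]
    test  <- iris[folds == i, ]
    mean(fit_predict(train, test) == test$Species)
  }))
})

print(round(cv_accuracy, 3))
```

Keeping every classifier behind the same `function(train, test)` interface makes it easy to add further candidates to the list without touching the evaluation loop.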

Best Answer

Random Forest

Fast, robust, good accuracy, in most cases nothing to tune, requires no normalization, largely insensitive to collinearity as far as prediction is concerned, generates a quite good error estimate (the out-of-bag error) and a useful importance ranking as a side effect of training, trivially parallel, predicts in the blink of an eye.

Drawbacks: slower than trivial methods like kNN or NB, works best with balanced classes, worse accuracy than SVM for problems desperately requiring the kernel trick, is a hard black box, does not make coffee.
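
To illustrate the "side effects of training" mentioned above, here is a minimal sketch. It assumes the randomForest package is installed (it is not part of base R) and uses the built-in iris data as a stand-in; the seed and the number of trees are arbitrary choices, not recommendations.

```r
# Sketch only: assumes install.packages("randomForest") has been run.
library(randomForest)

set.seed(1)
data(iris)

# Default settings, no feature scaling.
# importance = TRUE additionally records permutation-based importance.
rf <- randomForest(Species ~ ., data = iris, ntree = 500, importance = TRUE)

# Out-of-bag (OOB) error estimate after the last tree was added
oob_error <- rf$err.rate[rf$ntree, "OOB"]
print(oob_error)

# Per-feature importance, obtained without a separate analysis step
imp <- importance(rf)
print(imp)
```

The OOB error is computed from the samples each tree did not see during training, so no separate hold-out set is needed to get a first error estimate.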

Related Solutions

Solved – Classification model for movie rating prediction

Hein,

There are a lot of tools and libraries that provide this functionality.

Which one to choose depends on whether you would like to use a GUI for your work or embed it in some other program.

Standalone data mining tools (there are others, like WEKA with its Java interface):

RapidMiner
Orange
Rattle GUI for R
KNIME

Text based:

GNU R

Libs:

scikit-learn for Python
Mahout on Hadoop

If you know a programming language well enough, I would use a library for that language or give R a try. If not, you may try one of the tools with a GUI.

A tree example in R:

# we are using the iris dataset
data(iris)

# for our tree based model we use the rpart package
# to download it type install.packages("rpart")
library(rpart)

# Building the tree
fit <- rpart(Species ~ Petal.Length + Petal.Width, method="class", data=iris)

# Plot the tree
plot(fit)
text(fit)
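
To check how well such a tree predicts, one could hold out part of the data and compare predictions with the true labels. This is only a sketch under assumptions of my own: the 30% hold-out ratio and the seed are arbitrary, and the names test_idx, train, test and confusion are introduced just for illustration.

```r
library(rpart)
data(iris)

set.seed(7)
# hold out 30% of the rows for testing (arbitrary split ratio)
test_idx <- sample(nrow(iris), size = round(0.3 * nrow(iris)))
train <- iris[-test_idx, ]
test  <- iris[test_idx, ]

# same model as above, but fitted on the training rows only
fit <- rpart(Species ~ Petal.Length + Petal.Width, method = "class", data = train)

# predicted class labels for the held-out rows
pred <- predict(fit, newdata = test, type = "class")

# confusion matrix and overall accuracy on the held-out rows
confusion <- table(predicted = pred, actual = test$Species)
print(confusion)
accuracy <- mean(pred == test$Species)
print(accuracy)
```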

As noted, analysis with R requires you to write code yourself, but for most classification tasks you will find a package that works out of the box. An overview can be found in the Machine Learning Task View on CRAN.

To get started with RapidMiner you should have a look at YouTube. There are some screencasts, even for decision trees.

Naive Bayesian Classifiers – Why They Perform So Well in Classification

This paper seems to prove (I can't follow the math) that naive Bayes is good not only when the features are independent, but also when the dependencies among features are distributed evenly across the classes or cancel each other out:

In this paper, we propose a novel explanation on the superb classification performance of naive Bayes. We show that, essentially, the dependence distribution; i.e., how the local dependence of a node distributes in each class, evenly or unevenly, and how the local dependencies of all nodes work together, consistently (supporting a certain classification) or inconsistently (canceling each other out), plays a crucial role. Therefore, no matter how strong the dependences among attributes are, naive Bayes can still be optimal if the dependences distribute evenly in classes, or if the dependences cancel each other out.
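
For reference, here is a small base-R sketch of what naive Bayes actually computes: a class prior multiplied by per-feature likelihoods, as if the features were independent within each class. It is not taken from the paper and does not demonstrate its argument; the Gaussian likelihoods, the iris data, and the evaluation on the training rows are my own assumptions for illustration. The petal measurements in iris are clearly correlated, so it is at least a case where the independence assumption does not hold.

```r
data(iris)
X <- as.matrix(iris[, 1:4])
y <- iris$Species
classes <- levels(y)

# correlation between two features, to show they are not independent
print(cor(iris$Petal.Length, iris$Petal.Width))

# For every class: log prior + sum over features of log N(x_j | mean_j, sd_j).
# The result is a matrix with one row per sample and one column per class.
log_scores <- sapply(classes, function(cl) {
  Xc <- X[y == cl, , drop = FALSE]
  mu <- colMeans(Xc)            # per-feature mean within the class
  sigma <- apply(Xc, 2, sd)     # per-feature standard deviation within the class
  log_prior <- log(mean(y == cl))
  # summing per-feature log densities is where independence is assumed
  log_lik <- rowSums(sapply(seq_len(ncol(X)), function(j) {
    dnorm(X[, j], mean = mu[j], sd = sigma[j], log = TRUE)
  }))
  log_prior + log_lik
})

# predict the class with the highest log score
pred <- factor(classes[max.col(log_scores, ties.method = "first")], levels = classes)

# accuracy on the training rows (optimistic, only meant as a sanity check)
accuracy <- mean(pred == y)
print(accuracy)
```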
