Classification – What Base Classifiers to Use with Bagging for Optimal Performance

Tags: bagging, classification

I am working on a specific implementation of bootstrap aggregating (bagging). I want to see how well bagging works with different base classifiers, but so far the decision tree seems to be the only option I keep running into. I tried Naive Bayes, but bagging seems to make things worse on some data sets. Can you suggest some more algorithms that could be used as base learners for bagging?

Best Answer

Basically, you can bag any base learner. Following the diversification idea, base learners should tend to have low bias (i.e. they overfit) and, consequently, high variance; bagging then reduces that variance by averaging over bootstrap replicates.

Examples would include:

  • deep decision trees
  • $k$-nearest neighbour with low $k$
  • richly parametrized linear models (e.g. with splines, pairwise interactions etc.)
  • neural nets with enough complexity

Not very suitable from this perspective are:

  • decision stumps
  • overly simple linear models