Solved – Selecting features using AdaBoost

boosting, machine learning

How does AdaBoost select the best features from the sample data (or from a single feature vector)?

It would be nice if someone could explain whether this is actually how it works.

I've seen the terms "feature" and "classifier" used interchangeably in the available AdaBoost documentation, which is what led me to think this.

For example: www.cs.nyu.edu/~eugenew/publications/viola-facedet04-talk.pdf

Please explain!

Best Answer

Well, first of all, in the presentation you mentioned the weak classifier is simply a test of whether the value of a single feature is larger or smaller than some threshold (i.e. a decision tree of depth 1, a so-called decision stump); hence the feature/classifier ambiguity.
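
To make that concrete, here is a minimal sketch of such a setup: AdaBoost whose weak learners are depth-1 trees, each of which reduces to one feature compared against one threshold. This uses scikit-learn on toy data purely for illustration; it is not the Viola-Jones implementation.

```python
# Minimal sketch: AdaBoost over decision stumps (depth-1 trees).
# Assumes scikit-learn >= 1.2 (older versions use base_estimator=).
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Toy data standing in for the real training set.
X, y = make_classification(n_samples=500, n_features=20,
                           n_informative=5, random_state=0)

clf = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # the decision stump
    n_estimators=50,
    random_state=0,
)
clf.fit(X, y)

# Each fitted stump is exactly one feature tested against one threshold,
# which is why "feature" and "classifier" blur together in this setting.
stump = clf.estimators_[0]
print("feature:", stump.tree_.feature[0],
      "threshold:", stump.tree_.threshold[0])
```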

Going back to the question, there are numerous ways to derive a feature ranking from a boosted ensemble, ranging from counting where in the boosted structure each feature appears up to permutation tests. In the work you quoted there is in fact no explicit feature-selection step: during training each round simply selects the best feature/threshold pair, and during prediction the features are computed "on demand" as the example passes through the boosted cascade, to save computation time.
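As a rough illustration of both kinds of ranking (continuing the sketch above with the same `clf`, `X`, `y`; again scikit-learn convenience, not what Viola-Jones actually do), one can read off how heavily the fitted ensemble uses each feature, or run a permutation test:

```python
# Sketch: two ways to rank features from a fitted boosted ensemble.
import numpy as np
from sklearn.inspection import permutation_importance

# 1) Usage-based ranking: AdaBoost's built-in importances aggregate,
#    over all stumps, how much each feature contributed, weighted by
#    each stump's say in the ensemble.
usage_rank = np.argsort(clf.feature_importances_)[::-1]
print("ranked by ensemble usage:", usage_rank[:5])

# 2) Permutation test: shuffle one feature at a time and measure how
#    much the score drops; a bigger drop means a more important feature.
perm = permutation_importance(clf, X, y, n_repeats=10, random_state=0)
perm_rank = np.argsort(perm.importances_mean)[::-1]
print("ranked by permutation test:", perm_rank[:5])
```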