Solved – AdaBoost – Best Weak Learner with 0.5 Error


In AdaBoost, the weight $\alpha_t$ of a weak learner is set as

$\alpha_t = \frac{1}{2}\ln\frac{1-e_t}{e_t}$

under the assumptions that

$e_t = \frac{1}{2} - \gamma$ and $\gamma > 0$

Therefore:

$\alpha_t \leq 0$ for $e_t \geq 0.5$, with $\alpha_t = 0$ exactly at $e_t = 0.5$
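As a quick sanity check, plugging $e_t = 0.5$ into the formula above gives

$\alpha_t = \frac{1}{2}\ln\frac{1-0.5}{0.5} = \frac{1}{2}\ln 1 = 0$

so a weak learner that is no better than chance contributes nothing to the ensemble.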

So what happens in the case of the XOR problem and AdaBoost with Decision Stumps?

No weak learner can achieve an error rate better (i.e. lower) than $0.5$ in the first round, hence it should be $\alpha_t = 0$ for all $t$, making AdaBoost (with decision stumps) fail to solve the XOR problem.
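One quick way to verify this claim (a minimal NumPy sketch, not part of the original argument) is to enumerate every axis-aligned stump on the four XOR points, i.e. every combination of split feature, threshold and polarity, and confirm that none achieves an error below $0.5$:

```python
import numpy as np

# The four XOR points and their labels
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

best_error = 1.0
# A stump is defined by a split feature, a threshold and a polarity:
# it predicts 1 when (x[feature] > threshold) matches the polarity, else 0.
for feature in (0, 1):
    for threshold in (-0.5, 0.5, 1.5):
        for polarity in (True, False):
            pred = ((X[:, feature] > threshold) == polarity).astype(int)
            best_error = min(best_error, np.mean(pred != y))

print(best_error)  # 0.5 -- no stump beats random guessing on XOR
```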

However, it seems odd that such a versatile classifier as AdaBoost should fail on this rather simple problem merely because this assumption is violated in the first round of boosting.

Does AdaBoost with decision stumps really fail here? If so, what extensions overcome this problem? I have seen images of AdaBoost correctly solving the XOR problem; however, I believe those were modified forms of AdaBoost.

Best Answer

To develop the point made by Andreas:

No weak learner can achieve an error rate better (i.e. lower) than $0.5$ in the first round, hence it should be $\alpha_t = 0$ for all $t$, making AdaBoost (with decision stumps) fail to solve the XOR problem.

Indeed, in this particular (degenerate) case the AdaBoost algorithm with the following settings will fail (see the sketch after this list):

  • estimator = decision tree
  • max_depth = 1
  • max_leaf_nodes = 2
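A minimal sketch of this failure mode, assuming scikit-learn's AdaBoostClassifier with a depth-1 DecisionTreeClassifier as the weak learner (the parameter is called estimator in recent versions, base_estimator in older ones):

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

stump_boost = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # base_estimator= in older versions
    n_estimators=50,
    random_state=0,
)
try:
    stump_boost.fit(X, y)
    # Older (SAMME.R) behaviour: fitting succeeds but accuracy stays at chance level.
    print(stump_boost.score(X, y))
except ValueError as err:
    # Recent versions reject the first stump outright as worse than random.
    print(err)
```

Depending on the scikit-learn version and boosting variant, fitting either succeeds but never exceeds 50% training accuracy, or aborts in the first round because the stump is "worse than random"; either way, the ensemble fails on XOR.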

The point is that AdaBoost is not limited to decision trees; it can be used with many other weak estimators, such as logistic regression, support vector machines, etc.
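Purely as a syntax illustration (and not as a fix for XOR, since a linear base learner cannot beat chance on these four points either), swapping in a different weak estimator might look like this:

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression

# Any classifier that supports sample weights can serve as the weak learner.
# Note that a linear model such as logistic regression still cannot separate XOR,
# so for this particular dataset a deeper tree (or a kernel method) is needed.
linear_boost = AdaBoostClassifier(
    estimator=LogisticRegression(),  # base_estimator= in older scikit-learn versions
    n_estimators=20,
)
```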

In general, AdaBoost is most often used with decision trees, in which case the tree parameters are adjusted to the problem at hand. In this example you could increase the maximum depth to 2, in which case the issue disappears: a depth-2 tree can separate the XOR points on its own, so the first weak learner already has an error below $0.5$.
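A minimal sketch of that fix, again assuming scikit-learn (estimator= is base_estimator= in older versions):

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

# A depth-2 tree can carve out the XOR pattern by itself (split on x0, then on x1),
# so the very first weak learner already has an error below 0.5.
deeper_boost = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=2),  # base_estimator= in older versions
    n_estimators=10,
    random_state=0,
)
deeper_boost.fit(X, y)
print(deeper_boost.predict(X))   # expected: [0 1 1 0]
print(deeper_boost.score(X, y))  # expected: 1.0
```

Since the depth-2 tree should classify the four points perfectly, boosting effectively terminates after the first round with a zero-error weak learner.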

In scikit-learn, the default estimator of AdaBoostClassifier is a decision tree with max_depth=1, i.e. a decision stump, so the degenerate behavior described above is exactly what you get with the default settings.