Why Are Boosting Methods Sensitive to Outliers?

boosting, cart, machine-learning, outliers, svm

I have found many articles stating that boosting methods are sensitive to outliers, but none explaining why.

In my experience, outliers are bad for any machine learning algorithm, but why are boosting methods singled out as particularly sensitive?

How would the following algorithms rank in terms of sensitivity to outliers: boosted trees, random forests, neural networks, SVMs, and simple regression methods such as logistic regression?

Best Answer

Outliers can be bad for boosting because boosting builds each new tree on the previous trees' residuals/errors. Under squared-error loss, the negative gradient that each tree is fit to is exactly the residual, so outliers, whose residuals are much larger than those of other points, become the dominant targets in every round; gradient boosting therefore spends a disproportionate amount of its capacity trying to fit those few points. (AdaBoost behaves similarly: its exponential loss multiplicatively re-weights misclassified points, so a persistently misclassified outlier accumulates a very large weight.)
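Here is a minimal sketch of this effect, assuming scikit-learn's DecisionTreeRegressor as the base learner; the dataset, the learning rate of 0.5, and the outlier index are all illustrative, not part of the original answer. It runs a few rounds of gradient boosting with squared-error loss and prints which point has the largest residual each round:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = np.linspace(0, 10, 50).reshape(-1, 1)
y = np.sin(X.ravel()) + rng.normal(scale=0.1, size=50)
y[25] += 10.0  # inject a single gross outlier (illustrative choice)

pred = np.full_like(y, y.mean())  # initial constant model
for round_ in range(3):
    residuals = y - pred  # squared-error loss: negative gradient == residual
    stump = DecisionTreeRegressor(max_depth=1).fit(X, residuals)
    pred += 0.5 * stump.predict(X)  # assumed learning rate / shrinkage of 0.5
    top = int(np.argmax(np.abs(residuals)))
    print(f"round {round_}: largest |residual| = {abs(residuals[top]):.2f} "
          f"at index {top} (outlier is index 25)")
```

Round after round, the outlier keeps the largest residual, so each new tree keeps chasing it; a bagged method like a random forest, which fits every tree to the original targets independently, does not compound its attention on one point this way.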