Random forests select a random subset of the features at each node and consider only those features as candidate splits. Wikipedia notes this is sometimes called "feature bagging."
Does XGBoost also use this technique, in its tree learning, when growing an individual tree? Or does it consider all possible features when examining possible splits?
I read the original XGBoost paper [1], but couldn't tell from it. Section 2.3 of that paper mentions "column sub-sampling", which appears to be the same technique, and it notes that column sub-sampling is used in random forests, but it doesn't explicitly say whether XGBoost itself uses it.
References:
[1] XGBoost: A Scalable Tree Boosting System. Tianqi Chen, Carlos Guestrin. KDD'16, arXiv:1603.02754.
Best Answer
See the XGBoost parameter documentation and search for `colsample_bytree`. By default (`colsample_bytree=1`) XGBoost considers all features, but you can set `colsample_bytree` below 1 to have each tree trained on a random subset of the features. XGBoost also provides `colsample_bylevel` and `colsample_bynode` to subsample columns again at each tree level or at each split.