Boosting – How Does XGBoost Select Which Feature to Split On?

Tags: boosting, cart, random-forest

Pretty straightforward question: does XGBoost check every feature and pick the one that maximizes some split-quality metric (like a decision tree), or does it select a feature at random (like a random forest)?
Best Answer
I believe the xgboost paper is a little imprecise on this point, but I will share my interpretation of what the authors describe. This understanding is not based on reading the code, so take it with a grain of salt and hope that a braver soul puts a finer point on this answer (which I am writing because I had the same question after reading the paper).
The lead-in to the discussion of split finding describes the exact greedy algorithm, which enumerates over all possible splits on all the features. The paper then says that this exhaustive search is infeasible for large datasets and also does not interact well with distributed training, so an approximate algorithm is needed. That algorithm is described in some detail in the appendix, and it is (I believe) specifically an algorithm for finding candidate split points for a given feature.
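To make that concrete, here is a minimal Python sketch of what per-feature candidate proposal could look like. The function name `propose_candidates` and the parameter `num_candidates` are my own, and plain percentiles are a simplification: the paper actually proposes candidates with a weighted quantile sketch, where each instance is weighted by its second-order gradient.

```python
import numpy as np

def propose_candidates(feature_values, num_candidates=32):
    """Propose candidate split points for ONE feature.

    Simplified sketch: plain percentiles of the feature distribution.
    The paper's actual algorithm uses a *weighted* quantile sketch in
    which each instance is weighted by its hessian.
    """
    # interior percentiles only: splitting at the min or max is useless
    quantiles = np.linspace(0, 100, num_candidates + 2)[1:-1]
    return np.unique(np.percentile(feature_values, quantiles))
```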
Their description of how this approximate algorithm is used says, in summary, that candidate splitting points are first proposed according to percentiles of each feature's distribution, the continuous features are mapped into buckets delimited by these candidates, and the best solution is then found among the proposals based on aggregated gradient statistics.
I take this to mean that candidate splits are proposed for every single feature in the dataset, and that all of these candidate splits are inspected to find the optimal one. In other words, the method really is an approximate form of the exact algorithm that inspects every feature and every split to find the best one; the approximation happens only at the level of which split points are proposed for each feature. Every single feature is still inspected.
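Putting the pieces together, here is a hedged sketch of the full split search under that interpretation, reusing `propose_candidates` from the sketch above. The name `best_split` is hypothetical; `grad` and `hess` are the per-instance first- and second-order gradients of the loss, and the gain expression is the structural score from Eq. (7) of the paper. The real implementation aggregates gradient statistics into buckets once per feature rather than rescanning the data for every candidate, but the search structure is the same: every feature, a shortlist of splits per feature, best gain wins.

```python
import numpy as np

def best_split(X, grad, hess, reg_lambda=1.0, gamma=0.0):
    """Search every feature, but only the proposed candidate splits.

    The gain is the structural score from the XGBoost paper (Eq. 7);
    higher is better, and gamma penalizes adding a split at all.
    """
    G, H = grad.sum(), hess.sum()
    best = (-np.inf, None, None)  # (gain, feature index, threshold)
    for j in range(X.shape[1]):                # every single feature
        for t in propose_candidates(X[:, j]):  # only the candidate splits
            left = X[:, j] < t
            G_L, H_L = grad[left].sum(), hess[left].sum()
            G_R, H_R = G - G_L, H - H_L
            gain = 0.5 * (G_L**2 / (H_L + reg_lambda)
                          + G_R**2 / (H_R + reg_lambda)
                          - G**2 / (H + reg_lambda)) - gamma
            if gain > best[0]:
                best = (gain, j, t)
    return best

# Toy usage, assuming squared-error loss at an all-zero prediction,
# so grad = pred - y = -y and hess = 1 for every instance.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 2] > 0.5).astype(float)          # signal lives in feature 2
print(best_split(X, -y, np.ones_like(y)))  # should split feature 2 near 0.5
```

Note how the randomness of a random forest never enters: no feature is ever skipped, so whichever feature carries the signal is found deterministically given the candidate sets.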