What's the difference, if any at all, between max_depth and max_leaf_nodes in sklearn's RandomForestClassifier for a simple binary classification problem?
If the model always grows trees in a symmetric fashion, one would assume setting max_depth = 5 is equivalent to setting max_leaf_nodes = 32, since a full binary tree of depth 5 has 2^5 = 32 leaves.
The fact that sklearn gives us 2 options suggests that might not be the case.
Best Answer
As @whuber points out in a comment, a 32-leaf tree may have depth larger than 5 (up to 31, if every split peels off a single leaf). To answer your followup question: yes, when max_leaf_nodes is set, sklearn builds the tree in a best-first fashion rather than a depth-first fashion. From the docs (emphasis added):
and in the source code:
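If you want to see the difference for yourself, here is a small experiment (my own sketch, not taken from the sklearn docs or source). It fits one forest capped with max_depth=5 and another capped with max_leaf_nodes=32, then reports the depth and leaf count of each tree. The synthetic dataset, the forest size, and the helper name tree_shapes are assumptions chosen purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary classification data (an assumption, for illustration only).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=5, random_state=0
)


def tree_shapes(forest):
    """Return (depth, n_leaves) for every fitted tree in the forest.

    Hypothetical helper; it relies only on the public get_depth() and
    get_n_leaves() methods of the fitted decision trees.
    """
    return [(t.get_depth(), t.get_n_leaves()) for t in forest.estimators_]


# Forest whose trees are limited by depth.
depth_capped = RandomForestClassifier(
    n_estimators=10, max_depth=5, random_state=0
).fit(X, y)

# Forest whose trees are limited by leaf count (grown best-first).
leaf_capped = RandomForestClassifier(
    n_estimators=10, max_leaf_nodes=32, random_state=0
).fit(X, y)

depth_shapes = tree_shapes(depth_capped)
leaf_shapes = tree_shapes(leaf_capped)

print("max_depth=5       (depth, leaves):", depth_shapes)
print("max_leaf_nodes=32 (depth, leaves):", leaf_shapes)
```

With max_depth=5 no tree can be deeper than 5 or have more than 32 leaves. With max_leaf_nodes=32 the leaf count is capped at 32 but the depth is not capped at 5, so on most datasets you should expect at least some of those trees to come out deeper and less balanced. The exact numbers depend on the data and the random seed.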