Gradient tree boosting as proposed by Friedman uses decision trees as base learners. I'm wondering whether we should make the base decision trees as complex as possible (fully grown) or keep them simpler. Is there an explanation for this choice?
Random Forest is another ensemble method using decision trees as base learners.
Based on my understanding, we generally use almost fully grown decision trees in each iteration. Am I right?
Best Answer
$\text{error} = \text{bias}^2 + \text{variance} + \text{irreducible error}$

Boosting reduces bias by sequentially correcting the errors of the ensemble built so far, so it works best with shallow, high-bias/low-variance base learners (weak learners). Random Forest reduces variance by averaging many decorrelated trees, so it benefits from deep, low-bias/high-variance trees. That is why gradient boosting typically uses small trees, while Random Forest grows its trees (almost) fully.
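As a quick sanity check of the decomposition, here is a minimal stdlib-Python simulation. It compares an unbiased estimator of a mean with a deliberately biased "shrunk" estimator (a hypothetical stand-in for a high-bias/low-variance weak learner) and estimates the bias² and variance of each over many resampled datasets:

```python
import random

random.seed(0)

TRUE_MEAN = 2.0
N_DATASETS = 20000
N_SAMPLES = 5

def draw_dataset():
    # Gaussian samples around the true mean (noise std = 1)
    return [random.gauss(TRUE_MEAN, 1.0) for _ in range(N_SAMPLES)]

def decompose(estimator):
    """Estimate bias^2 and variance of an estimator of TRUE_MEAN
    over many independently resampled datasets."""
    estimates = [estimator(draw_dataset()) for _ in range(N_DATASETS)]
    mean_est = sum(estimates) / len(estimates)
    bias_sq = (mean_est - TRUE_MEAN) ** 2
    variance = sum((e - mean_est) ** 2 for e in estimates) / len(estimates)
    return bias_sq, variance

# "Complex" estimator: sample mean (unbiased, higher variance)
b1, v1 = decompose(lambda xs: sum(xs) / len(xs))
# "Simple" estimator: shrunk mean (biased, lower variance) --
# an illustrative stand-in for a weak learner
b2, v2 = decompose(lambda xs: 0.5 * sum(xs) / len(xs))

print(f"unbiased: bias^2={b1:.4f} variance={v1:.4f} error={b1 + v1:.4f}")
print(f"shrunk:   bias^2={b2:.4f} variance={v2:.4f} error={b2 + v2:.4f}")
```

The unbiased estimator has essentially zero bias² but variance near 1/5, while the shrunk one trades a bias² near 1 for a quarter of the variance. Which trade-off wins depends on how the ensemble combines its members, which is exactly the boosting-vs-RF distinction.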
Please note that, unlike boosting (which is sequential), RF grows its trees in parallel and independently. The term *iterative* that you used is thus inappropriate for RF.
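To make the weak-learner point concrete, here is a minimal, self-contained sketch of gradient boosting for squared loss using depth-1 stumps (the helper names and the toy step-function data are my own illustration, not from Friedman's paper). Even though each stump is heavily biased on its own, the sequential residual fitting drives the bias down:

```python
import random

def fit_stump(xs, ys):
    """Depth-1 regression tree: pick the split threshold minimizing SSE."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = (sum((y - lm) ** 2 for y in left)
               + sum((y - rm) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def gradient_boost(xs, ys, n_rounds=50, lr=0.1):
    """Gradient boosting with squared loss: each round fits a stump
    to the current residuals (the negative gradient)."""
    f0 = sum(ys) / len(ys)
    stumps = []
    preds = [f0] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: f0 + lr * sum(s(x) for s in stumps)

# Toy data: a noisy step function -- a single stump is a weak fit,
# but 50 boosted stumps recover the step closely
random.seed(1)
xs = [i / 100 for i in range(100)]
ys = [(1.0 if x > 0.5 else 0.0) + random.gauss(0, 0.05) for x in xs]
model = gradient_boost(xs, ys)
mse = sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

In an RF, by contrast, each of these stumps would be grown independently and averaged, which would reduce variance but leave the stump's large bias untouched.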