Solved – How to find optimal values for the tuning parameters in boosting trees

boosting, computational-statistics, cross-validation, machine-learning

I realise that there are 3 tuning parameters in a boosted trees model, i.e.

  1. the number of trees (number of iterations)
  2. shrinkage parameter
  3. number of splits (the size of each constituent tree)

My question is: for each of the tuning parameters, how should I find its optimal value, and by what method?

Note that the shrinkage parameter and the number of trees operate together, i.e. a smaller value for the shrinkage parameter requires a larger number of trees. We need to take this into account too.

I am particularly interested in the method for finding the optimal value of the number of splits. Should it be based on cross-validation or on domain knowledge about the underlying model?

And how are these things carried out in the gbm package in R?

Best Answer

The caret package in R is tailor made for this.

Its train function takes a grid of parameter values and evaluates performance using various flavors of cross-validation or the bootstrap. The package author has written a book, Applied Predictive Modeling, which is highly recommended; 5 repeats of 10-fold cross-validation are used throughout the book.
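For concreteness, here is a minimal sketch of what that looks like; the data frame df and its outcome column y are hypothetical placeholders, and the grid values are just illustrative starting points:

    library(caret)

    # Grid over the tuning parameters; caret's "gbm" method also
    # expects n.minobsinnode in the grid.
    grid <- expand.grid(
      n.trees           = c(500, 1000, 2000),
      interaction.depth = c(1, 2, 4),
      shrinkage         = c(0.1, 0.01),
      n.minobsinnode    = 10
    )

    # 5 repeats of 10-fold cross-validation, as in the book
    ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 5)

    fit <- train(y ~ ., data = df, method = "gbm",
                 tuneGrid = grid, trControl = ctrl, verbose = FALSE)

    fit$bestTune  # parameter combination with the best resampled performance

Note that the grid handles the shrinkage/number-of-trees interplay directly: the smaller shrinkage values are evaluated at the larger n.trees values as well, and the cross-validated results are compared across all combinations.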

For choosing the tree depth, I would first go for subject-matter knowledge about the problem, i.e. if you do not expect any interactions, restrict the depth to 1 or go for a flexible parametric model (which is much easier to understand and interpret). That being said, I often find myself tuning the tree depth, as subject-matter knowledge is often very limited.

I think the gbm package itself only tunes the number of trees, with the tree depth and shrinkage held fixed.
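If you use gbm directly, one way to do this (again with hypothetical df and y, here assuming a binary 0/1 outcome) is to fix the depth and shrinkage, fit a deliberately generous number of trees with cv.folds set, and let gbm.perf() pick the iteration count:

    library(gbm)

    fit <- gbm(y ~ ., data = df, distribution = "bernoulli",
               n.trees = 5000,          # generous upper bound on iterations
               interaction.depth = 2,   # fixed tree depth
               shrinkage = 0.01,        # fixed learning rate
               cv.folds = 5)            # track cross-validated error

    # Optimal number of iterations according to 5-fold cross-validation
    best.iter <- gbm.perf(fit, method = "cv")
    print(best.iter)

You would then repeat this for a few depth/shrinkage combinations, or let caret handle that outer loop for you.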