Solved – How are individual trees added together in boosted regression tree

boostingcartmachine learning

I'm reading Introduction to Statistical Learning, James, G., et al. (2013), in which they describe the Boosted Regression Tree algorithm as follows. What I do not understand is Eq 8.10 and 8.11. What do "adding the new tree to the old tree" and "update the residuals", as signified by $\leftarrow$, mean mathematically?

**Algorithm 8.2: Boosting for Regression Trees**

1. Set $\hat f(x) = 0$ and $r_i = y_i$ for all $i$ in the training set.
2. For $b = 1, 2, \ldots, B$, repeat:
   (a) Fit a tree $\hat f^b$ with $d$ splits ($d+1$ terminal nodes) to the training data $(X, r)$.
   (b) Update $\hat f$ by adding in a shrunken version of the new tree:
   $$\hat f(x) \leftarrow \hat f(x) + \lambda \hat f^b(x) \tag{8.10}$$
   (c) Update the residuals:
   $$r_i \leftarrow r_i - \lambda \hat f^b(x_i) \tag{8.11}$$
3. Output the boosted model:
   $$\hat f(x) = \sum_{b=1}^{B} \lambda \hat f^b(x) \tag{8.12}$$

The working guide to boosted regression trees by the same authors does not explain how exactly to add the trees either.

Best Answer

They assume that you're keeping track of a "current estimator" $\hat f$, which is the sum of all the (shrunken) trees built so far. (In code you would just store this as an array of all the trees built so far.) The $\leftarrow$ sign just means "takes the new value". So when they say "add the new tree", they mean: append the new tree $\hat f^b$ to the array of trees you already store, so that wherever you previously would have computed $\hat f$, you now compute $\hat f + \lambda \hat f^b$.

The residual is just the difference between the response and your current prediction $\hat f$. So if you add something to $\hat f$, you need to subtract it from the residual so that the two continue to sum to the target response. Again, the $\leftarrow$ sign just means "takes the new value", so $r_i \leftarrow r_i - \lambda \hat f^b(x_i)$ would translate in code to r[i] -= lambda * tree_prediction[i] or something.
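To make the two update rules concrete, here is a minimal sketch of the whole loop. This is not the book's code: `fit_stump` is a hypothetical stand-in for "fit a tree with $d$ splits" (it fits a single-split stump), and the names `boost`, `B`, and `lam` are illustrative.

```python
def fit_stump(x, r):
    """Fit a one-split regression stump to the residuals r (step 2(a)).

    A stand-in for a depth-d regression tree: it finds the split of x
    that minimizes squared error and predicts the mean of r on each side.
    """
    best = None
    order = sorted(range(len(x)), key=lambda i: x[i])
    for k in range(1, len(x)):
        split = (x[order[k - 1]] + x[order[k]]) / 2
        left = [r[i] for i in range(len(x)) if x[i] <= split]
        right = [r[i] for i in range(len(x)) if x[i] > split]
        if not left or not right:
            continue
        cl = sum(left) / len(left)
        cr = sum(right) / len(right)
        sse = (sum((v - cl) ** 2 for v in left)
               + sum((v - cr) ** 2 for v in right))
        if best is None or sse < best[0]:
            best = (sse, split, cl, cr)
    _, split, cl, cr = best
    return lambda v: cl if v <= split else cr


def boost(x, y, B=100, lam=0.1):
    """Algorithm 8.2: boosting for regression, with stumps as base learners."""
    trees = []       # the "array of trees" representing the current f_hat
    r = list(y)      # step 1: f_hat = 0, so the residuals start as y itself
    for _ in range(B):
        tree = fit_stump(x, r)          # step 2(a): fit to residuals, not to y
        trees.append(tree)              # step 2(b), Eq 8.10: f_hat += lam * tree
        r = [ri - lam * tree(xi)        # step 2(c), Eq 8.11: keep r = y - f_hat
             for ri, xi in zip(r, x)]
    # step 3, Eq 8.12: the boosted model is the shrunken sum of all trees
    return lambda v: sum(lam * t(v) for t in trees)
```

Note that "adding the tree" and "updating the residual" are two views of the same bookkeeping: appending `tree` to `trees` changes $\hat f$, and the very next line subtracts $\lambda \hat f^b(x_i)$ from `r` so that `r` stays equal to $y_i - \hat f(x_i)$ throughout.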