Solved – R gbm – handling of missing values


I am trying to understand how gbm handles missing values.

I have seen this thread on the topic:

But it focuses on how to read the treatment of missing values from the fitted model's output. What I am interested in is how the algorithm treats missing values when fitting the trees. For example, does it treat a missing value as informative, or does it essentially ignore that feature?

I have not been able to find this information online so any responses would be much appreciated.

Best Answer

Update - the gbm package builds trees in which every split has three branches (left node, right node, and missing node). The model therefore treats missing values as a separate group with its own prediction, rather than ignoring the feature.
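To make the three-branch idea concrete, here is a minimal Python sketch of how an observation could be routed through such a tree. It is only an illustration of the concept, not gbm's actual implementation: the `Leaf` and `Split` classes, the `predict` function, and the "less than goes left" convention are all assumptions made for this example.

```python
import math
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    prediction: float


@dataclass
class Split:
    feature: int        # index of the feature tested at this node
    threshold: float    # hypothetical split point
    left: "Node"        # branch for observed values below the threshold
    right: "Node"       # branch for observed values at or above the threshold
    missing: "Node"     # separate branch for missing values


Node = Union[Leaf, Split]


def is_missing(value):
    """Treat None and NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def predict(node, row):
    """Walk the tree, sending missing values down their own branch."""
    while isinstance(node, Split):
        value = row[node.feature]
        if is_missing(value):
            node = node.missing
        elif value < node.threshold:
            node = node.left
        else:
            node = node.right
    return node.prediction


# A one-split tree: the missing branch carries its own prediction.
tree = Split(feature=0, threshold=5.0,
             left=Leaf(-1.0), right=Leaf(2.0), missing=Leaf(0.5))

print(predict(tree, [3.0]))            # observed value, goes left
print(predict(tree, [7.0]))            # observed value, goes right
print(predict(tree, [float("nan")]))   # missing value, goes to the missing branch
```

In R, the same structure can be inspected on a fitted model: `pretty.gbm.tree(fit, i.tree = 1)` (where `fit` is a placeholder name for your own gbm object) returns a table that includes `LeftNode`, `RightNode`, and `MissingNode` columns for each split.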

The missing-value branch is explained in the gbm.object documentation, in the section on c.splits.
