Solved – CART vs. Random Forests: How do random forests prevent greediness

cartrandom forest

as I've been learning about tree-based classification algorithms, I've learned the single tree algorithms are 'greedy', meaning that they'll minimize entropy at every stage. This means that sometimes these algorithms will fail to minimize total entropy because they are only looking one step ahead. Here's an example:
enter image description here

In this case, the picture above represents the 'ideal split'. That is, every square is homogeneous. However, a decision tree algorithm might make the following split:
enter image description here

This is because the algorithm will split the data in half along the x axis to minimize initial information gain, without considering future information gain.

Now, my question is as follows:

I'm trying to understand the random forest algorithm, and I was wondering if someone could explain how the algorithm would function in this situation. How would a random forest be able to find this 'optimal split'.

Best Answer

In the fitting process of each tree inside the forest greedy decisions are still made. However, in the bagging-also called voting-i.e. when the predictions are combined from all the trees into a single prediction, each tree gets an equal vote. So in short, the RandomForest algorithm is also greedy in the same sense as the CART algorithm. The RandomForest algorithm has a sample with replacement of the observations in the data so each tree will be slightly different. Given that CART is a non-stable algorithm (small changes of input data may lead to drastically different trees), this random sampling may alleviate the drawbacks of the greedy fitting of the CART fitting process.

Related Question