For random forests, nodes are split by minimizing Gini impurity or entropy over a set of candidate features; sklearn's RandomForestClassifier lets us choose between the Gini and entropy criteria. However, from what I have read about the Extra-Trees classifier, a random value is selected for each split (so I assumed Gini and entropy play no role there). Yet sklearn's ExtraTreesClassifier also offers the choice of Gini or entropy for splitting. I am a little confused here.
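For reference, both sklearn estimators do expose the same `criterion` parameter; a minimal sketch (the dataset here is synthetic, generated only for illustration):

```python
# Sketch: RandomForestClassifier and ExtraTreesClassifier both accept
# criterion="gini" or criterion="entropy"; the difference between them
# lies in how candidate cutpoints are generated, not in how splits
# are scored.
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

X, y = make_classification(n_samples=200, random_state=0)

rf = RandomForestClassifier(criterion="entropy", random_state=0).fit(X, y)
et = ExtraTreesClassifier(criterion="entropy", random_state=0).fit(X, y)

rf_acc = rf.score(X, y)
et_acc = et.score(X, y)
print(rf_acc, et_acc)
```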
Solved – How are random forest and extremely randomized trees split differently
classification, python, random forest, scikit-learn
Related Questions
- Solved – WEKA: Visualize combined trees of random forest classifier
- Solved – Difference between Random Forest and Extremely Randomized Trees
- Solved – Entropy Impurity, Gini Impurity, Information gain – differences
- Solved – What makes a Random Forest random besides bootstrapping and random sampling of features
Best Answer
One iteration of Random Forest:

- Select $m$ features randomly as a candidate set of splitting features
- Within each of these features $F_i$, with $i \in \{1, \dots, m\}$, search over the possible cutpoints for the one that optimizes the Gini / entropy / whatever measure
- Split on the best feature–cutpoint pair found

One iteration of Extremely Randomized Trees:

- Select $m$ features randomly as a candidate set of splitting features
- Within each of these features $F_i$, with $i \in \{1, \dots, m\}$, draw a single random cutpoint uniformly from the interval $(\min(F_i), \max(F_i))$, and evaluate the performance of this feature with this cutpoint with respect to the Gini / entropy / whatever measure
- Split on the best of these $m$ randomly drawn feature–cutpoint pairs

So both algorithms use the criterion to score candidate splits; they differ in how the cutpoints are generated: Random Forest optimizes the cutpoint within each feature, while Extremely Randomized Trees draws it at random.
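The Extra-Trees iteration above can be sketched in a few lines of NumPy. This is an illustrative toy implementation, not sklearn's actual code; the function name `extra_trees_split` and the use of Gini as the score are my choices for the example:

```python
import numpy as np

def gini(y):
    """Gini impurity of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def extra_trees_split(X, y, m, rng):
    """One Extra-Trees-style split: draw ONE random cutpoint per
    candidate feature, score each by weighted child Gini impurity,
    and return the best (lowest-impurity) candidate."""
    n, d = X.shape
    features = rng.choice(d, size=m, replace=False)  # m candidate features
    best = None
    for f in features:
        lo, hi = X[:, f].min(), X[:, f].max()
        if lo == hi:  # constant feature: no valid cutpoint
            continue
        cut = rng.uniform(lo, hi)  # the single random cutpoint
        left = X[:, f] <= cut
        score = (left.sum() * gini(y[left])
                 + (~left).sum() * gini(y[~left])) / n
        if best is None or score < best[0]:
            best = (score, f, cut)
    return best  # (impurity, feature index, cutpoint)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 2] > 0).astype(int)  # label depends on feature 2
best_split = extra_trees_split(X, y, m=3, rng=rng)
print(best_split)
```

A Random Forest split would replace the single `rng.uniform(lo, hi)` draw with a search over all cutpoints of each candidate feature; everything else (the impurity scoring, the choice of the best candidate) stays the same.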