When using rpart to create a classification tree, the values for the relative importance of each predictor show up along these lines:
Var1: 33
Var2: 31
Var3: 25
Var4: 3
In my case Var3 is plotted as the root node. I expected Var1 to be the root node, given that it has the highest relative importance. Based on this, would it be reasonable to expect Var1 to Var3 to show up more often and/or higher up towards the root of the tree? The question also applies to decision trees in general.
Thanks
Best Answer
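Var3 is the predictor that gives the best separation between the two child nodes for a single binary split of the full data set, which is why it sits at the root. Importance, on the other hand, is accumulated over the whole tree: a predictor can be used in several splits further down, and those contributions add up to a greater overall importance. In rpart the measure also credits a variable for splits where it acts as a surrogate, so a predictor can score high without being the primary split variable at the root.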
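
If you want to check this on your own fit, here is a minimal sketch. It assumes the kyphosis data set that ships with rpart as a stand-in for your data; the formula and the names `root_var`, `top_var` and `split_counts` are only illustrative. It puts the root split variable, the importance scores and the number of primary splits per variable side by side.

```r
library(rpart)

# kyphosis ships with rpart; it is only a stand-in for the asker's data
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# Variable used for the first (root) split: row 1 of fit$frame is the root node
root_var <- as.character(fit$frame$var[1])

# Raw importance scores and the variable with the largest score
imp <- fit$variable.importance
top_var <- names(which.max(imp))

# How often each variable is used as a primary split
# (rows marked "<leaf>" are terminal nodes, so they are dropped)
is_split <- fit$frame$var != "<leaf>"
split_counts <- table(as.character(fit$frame$var[is_split]))

print(root_var)
print(top_var)
print(round(100 * imp / sum(imp)))  # rescaled to sum to 100, as summary(fit) reports it
print(split_counts)
```

On your own tree, comparing `root_var` with `top_var` and looking at `split_counts` should show whether the top-ranked variable gets its importance from repeated splits lower down. Anything not explained by the split counts would come from surrogate splits, which `summary(fit)` lists node by node.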