Solved – Decision tree split vs importance

Tags: feature selection, machine learning, party, r

I recently created a decision tree model in R using the party package (a conditional inference tree, i.e. a ctree model).

I generated a visual representation of the decision tree to see the splits and the levels at which they occur.
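
For reference, the direct party fit and plot look roughly like this (a sketch, assuming the same formula and data frame dat used in the caret call below):

library(party)

tree <- ctree(formula, data = dat)  # fit a conditional inference tree
plot(tree)                          # shows splits, p-values, and terminal nodes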

I also computed the variable importance using the caret package.

library(caret)  # provides train() and varImp()

fit.ctree <- train(formula, data = dat, method = 'ctree')  # conditional inference tree via caret
ctreeVarImp <- varImp(fit.ctree)                           # variable importance from the fitted model
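
Printing or plotting the result shows the scaled importance scores (caret's varImp objects have print and plot methods):

print(ctreeVarImp)  # scaled importance scores
plot(ctreeVarImp)   # dotplot of the same scores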

I was under the impression that the order of the splits in the tree was related to variable importance, i.e. that the variable at the first split is the most important, the variable at the second split the next most important, and so on. However, when I reviewed the importance of each variable, it did not match the order of the splits.

Is it possible that the ctree model generated directly with party is not the same as the one trained through caret?
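
One way to check, sketched on the assumption that caret's 'ctree' method tunes mincriterion, is to pull the selected tuning value and the final model out of the train object and refit directly with party at the same setting:

fit.ctree$bestTune    # the mincriterion value caret selected
fit.ctree$finalModel  # the underlying party tree caret actually kept

library(party)
tree.matched <- ctree(formula, data = dat,
                      controls = ctree_control(mincriterion = fit.ctree$bestTune$mincriterion))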

Is the order of importance of the variables in decision trees related to the order of the splits?

Best Answer

I am not familiar with ctree, but in rpart (CART), variable importance is calculated in a much more complicated way than just the order of the splits.

A detailed description can be found here, on page 11:

A variable may appear in the tree many times, either as a primary or a surrogate variable. An overall measure of variable importance is the sum of the goodness of split measures for each split for which it was the primary variable, plus goodness * (adjusted agreement) for all splits in which it was a surrogate. In the printout these are scaled to sum to 100 and the rounded values are shown, omitting any variable whose proportion is less than 1%. Imagine two variables which were essentially duplicates of each other; if we did not count surrogates they would split the importance with neither showing up as strongly as it should.
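
As a quick illustration with rpart (a sketch; iris is just a stand-in dataset), the importance vector includes the surrogate contributions and need not follow the split order:

library(rpart)

fit <- rpart(Species ~ ., data = iris)
fit$variable.importance  # importance, including surrogate-split contributions
fit                      # printed tree shows the actual split order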
