The probability of the outcome produced by a classification decision tree

cart, machine learning, random forest

As far as I know, the outcome of a classification tree is the class that the input observation belongs to.

Is it possible to obtain the probability that the input observation belongs to the predicted class? Is there any other way to compute this probability?

I'm not able to use a regression tree because my target variable is discrete.

Best Answer

There are two common methods to get more-or-less continuous predictions.

  1. Don't grow the trees out to purity. Instead, enforce a minimum node size, a maximum depth, or a minimum impurity required to split. Some leaves (terminal nodes) will then be impure, and the fraction of training samples of each class in a leaf serves as the predicted probability of that class.
  2. Your post doesn't mention random forests except in a tag, but if you fit $n$ trees, you have $n$ binary predictions (one class vote per tree). If $k$ of the $n$ trees predict a specific class, the fraction $k/n$ is a value in $[0,1]$ representing the forest's confidence that a sample belongs to that class. (This method can also be combined with 1.)
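
The arithmetic behind both methods can be sketched in a few lines of plain Python. This is only an illustration, not a tree implementation: the function names `leaf_class_probabilities` and `forest_vote_fraction`, the labels, and the toy data are all made up for the example.

```python
from collections import Counter


def leaf_class_probabilities(leaf_labels):
    """Method 1: class proportions among the training labels in one impure leaf."""
    counts = Counter(leaf_labels)
    total = len(leaf_labels)
    return {label: count / total for label, count in counts.items()}


def forest_vote_fraction(tree_votes, target_class):
    """Method 2: fraction k/n of the n trees that vote for target_class."""
    k = sum(1 for vote in tree_votes if vote == target_class)
    return k / len(tree_votes)


# Hypothetical impure leaf holding four training samples.
leaf_labels = ["yes", "yes", "yes", "no"]
leaf_probs = leaf_class_probabilities(leaf_labels)

# Hypothetical class votes from a forest of n = 5 trees for one observation.
tree_votes = ["yes", "no", "yes", "yes", "no"]
vote_fraction = forest_vote_fraction(tree_votes, "yes")

print(leaf_probs)      # class proportions in the leaf
print(vote_fraction)   # k/n for class "yes"
```

An observation that lands in this leaf would get probability 0.75 for "yes", while the forest's vote fraction for "yes" is 3/5 = 0.6. If you use scikit-learn, `predict_proba` on a fitted `DecisionTreeClassifier` returns the leaf class fractions of method 1, and on a `RandomForestClassifier` it averages those per-tree probabilities rather than counting hard votes, which corresponds to combining the two methods.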