Solved – way to explain a prediction from a random forest model

machine learning, random forest

Say I've got a predictive classification model based on a random forest (using the randomForest package in R). I'd like to set it up so that end-users can specify an item to generate a prediction for, and it'll output a classification likelihood. So far, no problem.
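(The prediction step itself is straightforward with randomForest; roughly, with made-up names like Species, train_data and new_item:)

    library(randomForest)

    ## Train on the labelled data (Species and train_data are illustrative names)
    rf <- randomForest(Species ~ ., data = train_data, ntree = 500)

    ## Class-probability vector for a single new item (one-row data frame of predictors)
    predict(rf, newdata = new_item, type = "prob")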

It would also be useful/cool to output something like a variable importance graph, but for the specific item being predicted rather than for the training set. Something like:

Item X is predicted to be a Dog (73% likely)
Because:
Legs=4
Breath=bad
Fur=short
Food=nasty

You get the point. Is there a standard, or at least justifiable, way of extracting this information from a trained random forest? If so, does anyone have code that will do this for the randomForest package?

Best Answer

A first idea is to mimic the knock-out strategy from variable importance and test how mixing (permuting) each attribute degrades the forest's confidence in the object's classification (out of bag and with some repetitions, obviously). This requires some coding, but it is certainly achievable.
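For what it's worth, a rough sketch of that per-item knock-out idea, assuming a fitted rf model, the training data frame, and a one-row data frame new_item holding the predictors (the function name explain_item and the n_rep argument are purely illustrative):

    library(randomForest)

    ## Per-item "knock-out" importance (a sketch, not a standard method):
    ## for each predictor, replace the item's value with values drawn from the
    ## training data, re-predict, and record the average drop in the predicted
    ## probability of the item's originally predicted class.
    explain_item <- function(rf, new_item, train_data, n_rep = 100) {
      base_prob  <- predict(rf, newdata = new_item, type = "prob")
      pred_class <- colnames(base_prob)[which.max(base_prob)]
      vars <- names(new_item)

      drops <- sapply(vars, function(v) {
        repl_probs <- replicate(n_rep, {
          perturbed <- new_item
          perturbed[[v]] <- sample(train_data[[v]], 1)   # "mix" this attribute
          predict(rf, newdata = perturbed, type = "prob")[, pred_class]
        })
        base_prob[, pred_class] - mean(repl_probs)       # average confidence drop
      })

      sort(drops, decreasing = TRUE)  # biggest drop = most important for this item
    }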

However, I feel it is just a bad idea -- without the stabilizing effect of averaging over many objects, the result will probably be wildly variable, noisy (for not-so-confident objects, even nonsense attributes could show big impacts), and hard to interpret (rules involving two or more cooperating attributes will probably spread seemingly random impact across each contributing attribute).

So as not to leave you with a purely negative answer, I would rather look at the proximity matrix and the possible archetypes it may reveal -- this seems much more stable and straightforward.
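As a sketch of that route (again with illustrative names like Species and train_data), one could grow the forest with proximity = TRUE and pull a rough per-class archetype out of the proximity matrix:

    library(randomForest)

    ## Grow the forest with the proximity matrix among training cases
    rf <- randomForest(Species ~ ., data = train_data, proximity = TRUE)

    ## For each class, pick the training case with the highest average proximity
    ## to the other members of its class -- a rough "archetype" for that class
    archetypes <- sapply(levels(train_data$Species), function(cl) {
      idx <- which(train_data$Species == cl)
      avg_prox <- rowMeans(rf$proximity[idx, idx, drop = FALSE])
      idx[which.max(avg_prox)]
    })

    ## Inspect the archetypal cases; a new item can then be described by
    ## which archetype it most resembles
    train_data[archetypes, ]

    ## Optionally visualize the proximity structure
    MDSplot(rf, train_data$Species)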