Solved – Multi-output decision tree

cart, prediction

I have a dataset of 1000 tumours, each described by 6 parameters (my independent variables). For each tumour I also have the accuracy achieved by each of 8 different segmentation methods.

I would like to build a model that can predict, given the 6 parameters describing a tumour, which segmentation method would yield the highest accuracy score.
Is there any way I can do this with a decision tree, or even random forest approach?
If so, is there any software that can do that? (SPSS seems to only deal with binary decision trees.)
And if not, do you have a different suggestion?

Best Answer

If I understand your problem correctly, the best approach may not be multi-output at all.

You are trying to predict which segmentation to use. So it seems like you can do this in two ways.

  • Give each tumor a class (the segmentation method that got the best accuracy score) and do ordinary class prediction. This is, I think, what you said in reply to Peter's response. It's true that this ignores the second-best method, but most classifiers can also give you a probability for each class, which tells you how confident the prediction is.

  • Frame it as a regression problem: predict the accuracy of each method. For any new tumor you would then have one predicted accuracy score per method, and you would pick the method with the highest predicted score.
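As a rough illustration of the first option, here is a minimal sketch using scikit-learn. The data is randomly generated as a stand-in for the real tumours, and the variable names (`X`, `accuracy`, `best_method`) are my own assumptions, not anything from the question. The class label is simply the index of the best-scoring method for each tumour.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption, not the real dataset):
# 1000 tumours x 6 parameters, and 1000 x 8 accuracy scores.
X = rng.normal(size=(1000, 6))
accuracy = rng.uniform(size=(1000, 8))

# Class label = index of the method with the highest accuracy per tumour.
best_method = accuracy.argmax(axis=1)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, best_method)

# Predict the best method for a new tumour, plus a probability per class.
new_tumour = rng.normal(size=(1, 6))
predicted_method = clf.predict(new_tumour)[0]
method_probabilities = clf.predict_proba(new_tumour)[0]
```

The entries of `method_probabilities` line up with `clf.classes_`, so sorting them gives a ranking of methods rather than only the top pick, which partly addresses the concern about ignoring the second-best method.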

Having said that, if you really want multi output prediction:

http://scikit-learn.org/stable/auto_examples/tree/plot_tree_regression_multioutput.html

http://scikit-learn.org/stable/modules/generated/sklearn.multioutput.MultiOutputRegressor.html
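For what it's worth, a multi-output fit along the lines of those two links might look like the sketch below. The data is again synthetic and the variable names are assumptions. Tree-based regressors in scikit-learn accept a 2-D target directly, while `MultiOutputRegressor` fits one separate model per target column for estimators that only handle a single output (`GradientBoostingRegressor` is used here purely as an example of such an estimator).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.multioutput import MultiOutputRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption): 6 parameters, 8 accuracy scores.
X = rng.normal(size=(1000, 6))
accuracy = rng.uniform(size=(1000, 8))

# A single tree and a random forest both accept a 2-D target as-is.
tree = DecisionTreeRegressor(max_depth=5, random_state=0).fit(X, accuracy)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, accuracy)

# Wrapper: fits one independent regressor per target column.
wrapped = MultiOutputRegressor(
    GradientBoostingRegressor(n_estimators=50, random_state=0)
).fit(X, accuracy)

new_tumours = rng.normal(size=(3, 6))
predicted_accuracy = forest.predict(new_tumours)  # one row per tumour, one column per method
recommended_method = predicted_accuracy.argmax(axis=1)  # index of the best predicted method
```

Taking the `argmax` over the predicted accuracies is the same decision rule as the regression framing above; the multi-output model just produces all 8 predictions in one call.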