Solved – How to interpret scikit learn classification tree

cart, machine learning, scikit learn

I'm currently trying to work with a scikit-learn classification tree.

I followed the example on the iris dataset: http://scikit-learn.org/stable/modules/tree.html and everything is working fine.

I do not understand the final graph, though. My problem is that values like [50, 0, 0] are not clear to me. Which value corresponds to which class? I mean, if I shuffle the rows of the dataset, classes 0, 1, 2 will not appear in the same order, so I would expect my tree values to change.

So does anybody know how I could change the code in order to have the proper initial labels before the values (and, if possible, how to plot the proportions for each leaf)?

Thanks,

PS: If you feel that this question is too scikit-learn specific, please edit it so it is sent to Stack Overflow instead.

Best Answer

The ordering of the classes in the value array should be deterministic and independent of the ordering of the samples in the training set.
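As a quick check of this claim, here is a minimal sketch that fits the same tree on an unbalanced slice of iris (the first 120 rows, an arbitrary choice for illustration) in its original order and in a shuffled order. The helper `root_proportions` is a hypothetical name of mine, not part of scikit-learn. It normalises the root node's row of `tree_.value`, since depending on the scikit-learn version that array holds either counts or fractions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
# Unbalanced subset: 50 setosa, 50 versicolor, 20 virginica.
X = iris.data[:120]
y = iris.target_names[iris.target[:120]]  # string labels instead of 0/1/2

# Shuffle the rows with a fixed seed so the run is repeatable.
perm = np.random.RandomState(0).permutation(len(y))

clf_original = DecisionTreeClassifier(random_state=0).fit(X, y)
clf_shuffled = DecisionTreeClassifier(random_state=0).fit(X[perm], y[perm])


def root_proportions(clf):
    """Class proportions at the root node (node 0, first output)."""
    root = clf.tree_.value[0, 0]
    return root / root.sum()


print(clf_original.classes_, root_proportions(clf_original))
print(clf_shuffled.classes_, root_proportions(clf_shuffled))
```

Both fits should report the same class order and the same root proportions, because the class order comes from the set of labels, not from the row order.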

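Either the unique string name or integer identifier for each class is stored in the classes_ attribute of the DecisionTreeClassifier instance after a call to the fit method. The entries of value follow the same order as classes_, so the first entry counts samples of classes_[0], the second of classes_[1], and so on.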
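To get labels in front of the values and proportions for each leaf, something along these lines may help. It is only a sketch: `max_depth=2` is an arbitrary choice to keep the tree small, and `leaf_summaries` is my own variable name. It pairs each leaf's row of `tree_.value` with `classes_`, then passes the same names to `export_graphviz` through its `class_names` and `proportion` parameters.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

iris = load_iris()
labels = iris.target_names[iris.target]  # train on string labels
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, labels)

# classes_ gives the order used for every row of tree_.value.
class_names = [str(c) for c in clf.classes_]

tree = clf.tree_
leaf_summaries = []
for node_id in range(tree.node_count):
    if tree.children_left[node_id] == -1:  # -1 marks a leaf node
        values = tree.value[node_id, 0]
        # Normalise so this works whether value holds counts or fractions.
        proportions = values / values.sum()
        summary = {name: round(float(p), 3)
                   for name, p in zip(class_names, proportions)}
        leaf_summaries.append((node_id, summary))
        print(node_id, summary)

# Same information in the plotted graph: labelled classes, proportions
# instead of raw counts. out_file=None returns the DOT source as a string.
dot_source = export_graphviz(
    clf,
    out_file=None,
    feature_names=iris.feature_names,
    class_names=class_names,
    proportion=True,
)
```

The resulting `dot_source` can be rendered with Graphviz in the same way as in the documentation example.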
