Solved – Calculating the K-fold cross validation confusion matrix

cart, cross-validation

When I do k-fold cross validation of a decision tree, the software produces a confusion matrix for the test data. Is each instance assigned the most frequent class in the cross validation results to create the confusion matrix?

Best Answer

During the k runs of a k-fold cross-validation, every instance is in the held-out test fold exactly once, so exactly one class prediction is made for it. That prediction comes from the model trained in that particular run, i.e. on the other k-1 folds.

How the prediction is made depends on the type of model. For a decision tree, the predicted class for an instance is in general the most frequent class among the training instances in the leaf that the instance falls into.
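To make the leaf rule concrete, here is a minimal Python sketch of a majority vote inside one leaf. The function name `leaf_prediction` and the example labels are made up for illustration; a real tree implementation may break ties or weight instances differently.

```python
from collections import Counter

def leaf_prediction(leaf_labels):
    """Return the most frequent class among the training labels in one leaf."""
    # most_common(1) returns a one-element list [(label, count)].
    return Counter(leaf_labels).most_common(1)[0][0]

# Hypothetical leaf that received five training instances.
labels_in_leaf = ["yes", "no", "yes", "yes", "no"]
predicted = leaf_prediction(labels_in_leaf)
```

Any test instance routed to this leaf would be predicted as `"yes"`, regardless of its true class.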

So after the cross-validation has finished, you have the known class and the predicted class for every instance, and the confusion matrix is computed from these pairs.
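The procedure described above can be sketched in plain Python. This is only an illustration under stated assumptions, not what any particular software does internally: the helper names (`kfold_indices`, `train_stump`, `predict_stump`, `cross_validated_confusion`) are hypothetical, the "tree" is a single-split stump on one numeric feature standing in for a full decision tree, the folds are assigned round-robin without shuffling or stratification, and the data is a toy example.

```python
from collections import Counter

def kfold_indices(n, k):
    """Split indices 0..n-1 into k disjoint test folds (round-robin, no shuffling)."""
    return [list(range(i, n, k)) for i in range(k)]

def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def train_stump(xs, ys):
    """Stand-in for a decision tree: one threshold split on a single feature."""
    threshold = sum(xs) / len(xs)  # split at the training mean (arbitrary choice)
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    default = majority(ys)
    # Each leaf predicts its most frequent training class;
    # an empty leaf falls back to the overall majority class.
    left_class = majority(left) if left else default
    right_class = majority(right) if right else default
    return threshold, left_class, right_class

def predict_stump(model, x):
    threshold, left_class, right_class = model
    return left_class if x <= threshold else right_class

def cross_validated_confusion(xs, ys, k):
    """Pool one out-of-fold prediction per instance into one confusion matrix."""
    predictions = [None] * len(xs)
    for test_idx in kfold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = train_stump(train_x, train_y)  # trained without the test fold
        for i in test_idx:
            predictions[i] = predict_stump(model, xs[i])
    # Count (known class, predicted class) pairs over all instances.
    matrix = Counter(zip(ys, predictions))
    return predictions, matrix

# Toy data: one numeric feature, two classes.
xs = [1.0, 1.2, 0.8, 1.1, 3.0, 3.2, 2.9, 3.1]
ys = ["a", "a", "a", "a", "b", "b", "b", "b"]
predictions, matrix = cross_validated_confusion(xs, ys, k=4)
```

Here `matrix[(known, predicted)]` is the count for one cell of the confusion matrix. Because every instance is predicted exactly once, the cell counts sum to the number of instances, and no voting across runs is involved.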

For more information I recommend the slides of Andrew Moore: