I am currently trying to run a kfold cross validation on a decision tree with a custom classification loss function, as described here.
However, I don't understand how the C and S matrices, which are passed to the loss function, are helpful.
1. Under the link it says "C is an n-by-K logical matrix with rows indicating which class the corresponding observation belongs." So this is not predicted and hence a repetition of the input data?
2. The S matrix. "S is an n-by-K numeric matrix of classification scores". Why can I not simply use the predicted classifications instead of the scores?
To be more specific: I create a classification decision tree. Next, I use crossval to get a partitionedModel. Then, I calculate the validation accuracy by using kfoldLoss. Now, instead of using the in built 'classiferror' function, I would like to use my own classification loss function, e.g. matthews correlation coefficient.
% create set of cross-validated classification model(s) from a classification model
partitionedModel = crossval(trainedClassifier.ClassificationTree, 'KFold', 10);% Loss, by default the fraction of misclassified data, is a scalar and averaged over all folds
validationAccuracy = 1 - kfoldLoss(partitionedModel, 'LossFun', 'classiferror');
Any help is greatly appreciated.
Best Answer