MATLAB: What is the meaning of loss() and predict() in the case of random forest

random forest

I'm training a random forest classifier with the following command:

rusTree = fitensemble(trainX,trainY,'RUSBoost',1000,t,'LearnRate',0.1,'nprint',100);

I then wanted to know how well the model works, so I ran the following commands:

L = loss(rusTree,testX,testY,'mode','cumulative');
[label,score] = predict(rusTree,testX);

Using label and testY, I counted the true positives and false positives in my data. Since L is the classification error, I expected my counts and L to be related. However, they seem to be unrelated. What exactly am I getting out of predict() and loss()?

Thank you!

June.

Best Answer

From your question, I understand that you are fitting a random forest classifier to your training data. I assume that 'trainY' contains categorical labels, since this is a classification task.

The answer has two parts. In your code

L = loss(rusTree,testX,testY,'mode','cumulative');

L holds the cumulative loss. It is a vector whose k-th element is the loss of the model when only the first k trees of the ensemble are used. This is very different from counting false positives and false negatives from the predicted labels, since the predict function uses all the trees at once. You can plot the loss to see the trend:

 figure;
 plot(loss(rusTree,testX,testY,'mode','cumulative'));
 xlabel('Number of trees');
 ylabel('Test classification error');

The trend shows how the loss changes as the number of trees grows.
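To make the connection between the cumulative loss and the labels from predict concrete, here is a minimal sketch. It assumes the ionosphere sample data set that ships with Statistics and Machine Learning Toolbox as a stand-in for your own trainX/trainY/testX/testY, uses fitcensemble (the newer replacement for fitensemble), and picks hypothetical tree settings and only 100 learning cycles to keep it quick. The last element of the cumulative loss should match the loss of the full ensemble. The plain misclassification rate from predict is usually close to it but need not be identical, because, as far as I understand, loss weights the observations using the class priors stored in the model.

```matlab
% Sketch: compare the cumulative loss with the error rate computed from predict().
% Assumption: the ionosphere sample data stands in for the asker's data.
load ionosphere                          % X: 351x34 predictors, Y: cell array of 'b'/'g'
rng(1);                                  % reproducible split and training
cv = cvpartition(Y,'HoldOut',0.3);       % stratified 70/30 split
trainX = X(training(cv),:);  trainY = Y(training(cv));
testX  = X(test(cv),:);      testY  = Y(test(cv));

t = templateTree('MaxNumSplits',10);     % hypothetical weak-learner settings
rusTree = fitcensemble(trainX,trainY,'Method','RUSBoost', ...
    'NumLearningCycles',100,'Learners',t,'LearnRate',0.1);

Lcum = loss(rusTree,testX,testY,'mode','cumulative');  % one value per ensemble size
Lall = loss(rusTree,testX,testY);                      % scalar, all trees used

label   = predict(rusTree,testX);
errRate = mean(~strcmp(label,testY));    % unweighted misclassification rate

fprintf('last cumulative loss: %.4f\n', Lcum(end));
fprintf('ensemble loss:        %.4f\n', Lall);
fprintf('error from predict:   %.4f\n', errRate);
```

If the numbers you compare are Lcum(1) or some middle element rather than Lcum(end), they describe a smaller ensemble than the one predict uses, which would explain the mismatch.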

MATLAB invokes different functions depending on the function signature (function overloading). Which definition runs is decided mainly by the class of the input arguments. The 'predict' function has multiple definitions. In your code

[label,score] = predict(rusTree,testX);

there is a chance that the predict function for predicting an image category (doc_link) gets invoked instead. In that case, 'score' would hold the negated average binary loss per class. Please be careful with your function signature.
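One way to check which predict will be picked is to look at the class of the model object, since that is what MATLAB dispatches on. The sketch below is only an illustration: it assumes the ionosphere sample data and a small AdaBoostM1 ensemble in place of your rusTree.

```matlab
% Sketch: inspect which predict applies to a given model object.
% Assumption: a small ensemble on the ionosphere sample data stands in for rusTree.
load ionosphere
rng(1);
mdl = fitcensemble(X,Y,'Method','AdaBoostM1','NumLearningCycles',20);

disp(class(mdl))                  % the class MATLAB uses to select the predict method
disp(ismethod(mdl,'predict'))     % 1 if the object has a predict method
which -all predict                % lists every predict found on the path
```

If class(mdl) reports an ensemble class and ismethod returns 1, the call predict(mdl,X) goes to the ensemble's own predict method.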

I would recommend using the following code for your evaluation:

L = loss(rusTree,testX,testY);
[label,score] = predict(rusTree,testX);   % predict on an ensemble returns labels and scores only

You can refer to this link for more details. I hope this will answer your question.
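For the true positive and false positive counts mentioned in the question, a confusion matrix is a convenient starting point. The sketch below uses made-up labels (not your data) and treats 'g' as the positive class; both are assumptions for illustration only.

```matlab
% Sketch: count TP/FP/FN/TN from true and predicted labels.
% Hypothetical labels; 'g' is treated as the positive class.
testY = {'g';'g';'b';'b';'g';'b'};
label = {'g';'b';'b';'g';'g';'b'};

% Fix the class order so that row/column 1 is the positive class.
C = confusionmat(testY,label,'Order',{'g','b'});   % rows: true, columns: predicted

TP = C(1,1);  FN = C(1,2);
FP = C(2,1);  TN = C(2,2);
errRate = (FP + FN) / sum(C(:));   % unweighted classification error
```

The quantity errRate is the unweighted counterpart of what loss(rusTree,testX,testY) reports, so this is the number to compare against L.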

Regards,

Darshan Bhat

Related Solutions

MATLAB: Check for missing argument or incorrect argument data type in call to function ‘predict’.

For cross-validated classification trees, create a classification partitioned object (https://www.mathworks.com/help/stats/classreg.learning.partition.classificationpartitionedmodel-class.html) using fitctree (https://www.mathworks.com/help/stats/fitctree.html#bt6cr7t-tree) and then use kfoldPredict (https://www.mathworks.com/help/stats/classificationpartitionedmodel.kfoldpredict.html).

You are using 'KFold', so you are creating a classification partitioned object and need to use kfoldPredict() instead of predict().
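A minimal sketch of that workflow, assuming the ionosphere sample data and 5 folds (both chosen only for illustration):

```matlab
% Sketch: cross-validated tree, then out-of-fold predictions with kfoldPredict.
% Assumption: ionosphere sample data and 5 folds, picked for illustration.
load ionosphere
rng(1);                                  % reproducible fold assignment
cvTree = fitctree(X,Y,'KFold',5);        % returns a partitioned (cross-validated) model

[label,score] = kfoldPredict(cvTree);    % each row is predicted by a tree that did not train on it
cvErr = kfoldLoss(cvTree);               % cross-validated classification error

fprintf('cross-validated error: %.4f\n', cvErr);
```

Calling predict(cvTree,X) on the partitioned object is what produces the "missing argument or incorrect argument data type" error, since that object has no predict method.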

MATLAB: How to get the predicted probabilities from “AdaBoostM1” score

The "score" returned by the "predict" function is a matrix of classification scores indicating the likelihood that a label comes from a particular class.

You could refer to Compact Classification Ensemble "predict" function documentation for more details: https://www.mathworks.com/help/releases/R2020a/stats/compactclassificationensemble.predict.html

As for "AdaBoostM1", you can convert the scores into probabilities by assigning the string "doublelogit" to the ScoreTransform property of the ensemble object.

That translates to the following code snippet:

>> mdl = fitcensemble(X,Y, "Method", "AdaBoostM1", "ScoreTransform", "doublelogit")
>> [label, problikescore] = predict(mdl,x)
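As a quick sanity check, here is a sketch on the ionosphere sample data (an assumption, as is the choice of 50 learning cycles) that confirms the transformed scores fall between 0 and 1:

```matlab
% Sketch: AdaBoostM1 scores mapped to [0,1] with the doublelogit transform.
% Assumption: ionosphere sample data and 50 learning cycles, for illustration only.
load ionosphere
rng(1);
mdl = fitcensemble(X,Y,'Method','AdaBoostM1', ...
    'NumLearningCycles',50,'ScoreTransform','doublelogit');

[label,probLikeScore] = predict(mdl,X(1:5,:));   % one row per observation, one column per class
disp(mdl.ClassNames')                            % column order of probLikeScore
disp(probLikeScore)
```

The columns of probLikeScore follow the order in mdl.ClassNames.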
