MATLAB: How to obtain scores as probabilistic estimates using the predict function on a fitcensemble model


After running the first example in the fitcensemble documentation, "Train Classification Ensemble" (<https://se.mathworks.com/help/stats/fitcensemble.html#bvdwkeh>), I am trying to get the probability of an observation belonging to each of the two classes. However, when I call [label,score] = predict(Mdl1,pX); the scores look odd:
7.0616 -7.0616
3.9722 -3.9722
….
Is it possible to transform these into probabilities?

Best Answer

The command '[label, score] = predict(Mdl1,pX)' does not return probability estimates because the 'LogitBoost' algorithm used by the model does not treat scores as probabilistic estimates. Instead, a score represents the confidence of a classification into a class: the higher the score, the higher the confidence.
This is explained in the documentation link below:
<https://www.mathworks.com/help/stats/compactclassificationensemble.predict.html#bvciha4>
If you would like the scores to be probabilistic estimates, you may consider one of two options:
1) Set the 'ScoreTransform' name-value pair in 'fitcensemble' to 'logit'. This transforms the scores into probabilistic estimates.
https://www.mathworks.com/help/stats/fitcensemble.html#bvcj_s0-1_sep_shared-ScoreTransform
Using predict on the model then returns scores as probability values for each class.
For example:
MdlFinal = fitcensemble(X,Y,'NumLearningCycles',idxNumTrees,...
'Learners',tFinal,'LearnRate',learnRate(idxLR),'ScoreTransform','logit')
>> [~,scores] = predict(MdlFinal,X)
scores =
0.0360 0.9640
0.8054 0.1946
0.0278 0.9722
...
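
As a minimal sketch of what the 'logit' transform does, the code below applies it to an already trained model and compares the result with 1/(1 + exp(-score)) computed by hand. It assumes the ionosphere sample data set that ships with the toolbox (not necessarily the data behind Mdl1 in the question), and it sets 'Method' to 'LogitBoost' explicitly rather than relying on the default. The variable names are illustrative only.

```matlab
% Assumption: ionosphere sample data (X: predictors, Y: labels 'b'/'g')
load ionosphere
Mdl = fitcensemble(X,Y,'Method','LogitBoost');

% Raw LogitBoost scores: confidence values, not probabilities
[~,rawScore] = predict(Mdl,X);

% Change the score transform on the trained model using dot notation
Mdl.ScoreTransform = 'logit';
[~,probScore] = predict(Mdl,X);

% 'logit' corresponds to 1/(1 + exp(-x)), applied here by hand
manual  = 1./(1 + exp(-rawScore));
maxDiff = max(abs(probScore(:) - manual(:)));

% Each row of the transformed scores should sum to 1
rowSums = sum(probScore,2);
```

Setting ScoreTransform on the trained model avoids retraining when only the form of the output needs to change.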
2) Set the 'Method' name-value pair of 'fitcensemble' to 'Bag'.
As mentioned in the documentation, the 'Bag' algorithm returns scores as probability values.
The 'Method' name-value pair is described in the 'fitcensemble' documentation.
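
The second option could look like the following sketch. It again assumes the ionosphere sample data set from the toolbox rather than the data in the question, and the variable names are illustrative. No score transform is set, since a bagged ensemble is expected to return probability-like scores directly.

```matlab
% Assumption: ionosphere sample data (X: predictors, Y: labels 'b'/'g')
load ionosphere
rng(1)   % bagging resamples the data, so fix the seed for repeatability

% Train a bagged ensemble of classification trees
MdlBag = fitcensemble(X,Y,'Method','Bag');

% One score column per class, in the order given by MdlBag.ClassNames
[labelBag,scoreBag] = predict(MdlBag,X);

inRange = all(scoreBag(:) >= 0 & scoreBag(:) <= 1);
rowSums = sum(scoreBag,2);
```

Note that switching to 'Bag' changes the learning algorithm itself, so the predicted labels may differ from those of the 'LogitBoost' model.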