So here is my problem:
I am using a cross-validated SVM classifier to distinguish between two classes, 'A' and 'B'. After training the classifier, I transform the validation predictors and compute predictions using the following functions:
pcaTransformationFcn = @(x) [ array2table((table2array(varfun(@double, x(:, ~isCategoricalPredictorBeforePCA))) ...
    - PCACenters) * PCACoefficients), x(:, isCategoricalPredictorBeforePCA) ];
svmPredictFcn = @(x) predict(ScoreTransformedSVM.Trained{1:end}, x);
validationPredictFcn = @(x) svmPredictFcn(pcaTransformationFcn(x));
However, the score values I get do not make much sense to me. For example, if an observation is predicted to be class A, the score for that class can be +2, and the score for class B, which was not predicted, would be -2.
I don't get this issue with CVKNN or CVDiscriminant classifiers. By this I mean that these two models produce likelihood scores: if an observation is predicted to be class A, its likelihood score is, for example, 0.9, and the score for class B, which was not predicted, would be 0.1. This is what I want for CVSVM.
My question is, how do I transform the scores so that I get the probabilities rather than the score SVM puts out?
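For context, the usual way to map a raw SVM score to a probability is a fitted sigmoid (Platt scaling). As far as I understand the documentation, this is the kind of function fitSVMPosterior fits when the classes overlap. In the sketch below, $s$ is the raw score for the positive class, and $\alpha$ and $\beta$ are fitted slope and intercept parameters (symbols chosen here for illustration; they are not the class labels 'A' and 'B'):

```latex
P(\text{positive class} \mid s) = \frac{1}{1 + \exp(\alpha s + \beta)},
\qquad
P(\text{negative class} \mid s) = 1 - P(\text{positive class} \mid s)
```

With a negative slope $\alpha$, a large positive score such as +2 maps to a probability close to 1, and the two class probabilities always sum to 1, which is the 0.9 / 0.1 behaviour I am after.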
Here's what I have tried:
% Train a classifier
classificationSVM = fitcsvm(...
    ext.TrainingPredictors, ...
    ext.TrainingResponse, ...
    'KernelFunction', 'polynomial', ...
    'PolynomialOrder', polyOrder, ...
    'KernelScale', 'auto', ...
    'BoxConstraint', 1, ...
    'Standardize', true, ...
    'ClassNames', classNames, ...
    'KFold', 10);
ScoreTransformedSVM = fitSVMPosterior(classificationSVM);

% Create the result struct with predict function
pcaTransformationFcn = @(x) [ array2table((table2array(varfun(@double, x(:, ~isCategoricalPredictorBeforePCA))) ...
    - PCACenters) * PCACoefficients), x(:, isCategoricalPredictorBeforePCA) ];
svmPredictFcn = @(x) predict(ScoreTransformedSVM.Trained{1:end}, x);
validationPredictFcn = @(x) svmPredictFcn(pcaTransformationFcn(x));

% Compute validation predictions
ext.validationPredictors;
ext.validationResponse;
[validationPredictions, validationScores] = validationPredictFcn(ext.validationPredictors);
The documentation says: "You can estimate test-sample, posterior probabilities by passing ScoreCVSVMModel to kfoldPredict" (see https://uk.mathworks.com/help/stats/fitsvmposterior.html). The problem is that kfoldPredict only accepts the model as an input.
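To make the documented workflow concrete, here is a minimal sketch of it. The fisheriris sample data and the variable names X, Y, CVSVMModel, cvLabels and cvPosterior are stand-ins chosen for illustration, not my actual data or code. It shows that kfoldPredict returns out-of-fold posterior probabilities for the training observations only, with no argument for new predictors:

```matlab
% Sketch only: fisheriris is a stand-in for the real data (assumption).
rng(1);                                  % reproducible fold assignment
load fisheriris                          % provides meas (150x4) and species
keep = ~strcmp(species, 'setosa');       % keep the two overlapping classes
X = meas(keep, 3:4);                     % 100 observations, 2 predictors
Y = species(keep);

% Cross-validated SVM, analogous to the 'KFold', 10 call in the question
CVSVMModel = fitcsvm(X, Y, 'Standardize', true, 'KFold', 10);

% Fit the score-to-posterior transformation on the cross-validated model
ScoreCVSVMModel = fitSVMPosterior(CVSVMModel);

% Out-of-fold predictions: one row per training observation,
% one column per class; kfoldPredict takes no new-data argument
[cvLabels, cvPosterior] = kfoldPredict(ScoreCVSVMModel);
```

Each row of cvPosterior should hold the two class probabilities for an observation that was held out when its fold's model was trained, so these are estimates for the training set, not for a separate validation set.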
How can I estimate the validation sample posterior probabilities if I can't pass the validationPredictors into the function?
Am I missing something?
I get the desired outcome if I don't use a CVSVM classifier. However, is there a way to get the likelihood scores or posterior probabilities of the validation data?
If you would like a copy of my data or any more information, please let me know.
Thanks in advance!
Best Answer