MATLAB: Error when training a decision tree model

classification, decision tree, image processing, machine learning

I want to classify images with a decision tree classifier, but I get an error. My code and the error message are below. A prompt reply would be much appreciated, thank you. How do I change train_data to a numeric matrix?
-- Decision tree --
load('train_highimg_fea.mat','a');
load('train_lowimg_fea.mat','b');
load('test_highimg_fea.mat','c');
load('test_lowimg_fea.mat','d');
train_data = [a; b];
train_label = ones(400,1);
train_label(201:400) = -1;
% Train the classifier
TreeMdl = fitctree(train_data ,train_label);
test_data = [c;d];
test_label = ones(150,1);
test_label(101:150) = -1;
predict_labletree= predict(TreeMdl,test_data);
save('predict_labletree.mat');
-- Command Window error --
Error using classreg.learning.FullClassificationRegressionModel.prepareDataCR (line 171)
X must be a numeric matrix.
Error in ClassificationTree.prepareData (line 512)
classreg.learning.FullClassificationRegressionModel.prepareDataCR(...
Error in classreg.learning.FitTemplate/fit (line 213)
this.PrepareData(X,Y,this.BaseFitObjectArgs{:});
Error in fitctree (line 194)
this = fit(temp,X,Y);
Error in decesiontree_test (line 15)
TreeMdl = fitctree(train_data ,train_label);

Best Answer

Yes, this is to be expected.
You create a vector of length 48000 for each input image. Your images are 240 x 80. For each image you calculate a 5 x 8 cell of 240 x 80 arrays, reduce each array by a factor of 4 in each direction, and put the results together in one vector. So you have 5 * 8 * (240/4) * (80/4) = 5 * 8 * 60 * 20 = 48000 elements per image, which, incidentally, is 2.5 times the number of pixels in the original image.
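As a quick check of that arithmetic, here is a minimal sketch. The variable names are made up for illustration and do not come from your feature-extraction code.

```matlab
% Feature-vector length implied by the description above.
% All variable names here are illustrative assumptions.
img_rows = 240;  img_cols = 80;      % image size
cell_rows = 5;   cell_cols = 8;      % the 5 x 8 cell of per-image arrays
down = 4;                            % reduction factor in each direction

feat_len = cell_rows * cell_cols * (img_rows/down) * (img_cols/down);
ratio    = feat_len / (img_rows * img_cols);

fprintf('feature length = %d, ratio to pixel count = %.1f\n', feat_len, ratio);
% prints: feature length = 48000, ratio to pixel count = 2.5
```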
So each of the 20 images in your "high" group is associated with a vector of length 48000. The 10 images in your "low" group are processed through smote() to create data equivalent to another 10 images, so you have the equivalent of 20 images for the "low" group, again each of length 48000.
Your code loads these together and constructs a 40 x 1 cell of those 48000 x 1 vectors.
And then you are stuck, because you have to train on a numeric array, not on a cell array.
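If you want to confirm this for yourself, something like the following could be run on the variable before calling fitctree. It is only a sketch: the train_data here is a stand-in built to match the shape described above, not your actual loaded data.

```matlab
% Stand-in for the loaded data: a 40 x 1 cell array of 48000 x 1 vectors.
% (Assumed shape, based on the description above.)
train_data = repmat({zeros(48000, 1)}, 40, 1);

% fitctree needs a numeric matrix (or a table); a cell array fails this check.
is_numeric_matrix = isnumeric(train_data) && ismatrix(train_data);
fprintf('class: %s, size: %s, numeric matrix: %d\n', ...
        class(train_data), mat2str(size(train_data)), is_numeric_matrix);

% Check that every cell holds a column vector of the same length,
% which is what makes a later concatenation possible.
elem_len  = cellfun(@numel, train_data);
all_same  = all(elem_len == elem_len(1));
all_cols  = all(cellfun(@iscolumn, train_data));
```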
Now, the first input, X, to fitctree, is described as
"Predictor data, specified as a numeric matrix. Each row of X corresponds to one observation, and each column corresponds to one predictor variable."
You seem to have 40 observations (images), and 48000 predictor variables, so what you probably need for X is
X = horzcat(train_data{:}).';
which will give you a 40 x 48000 numeric array.
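To illustrate the conversion on its own, here is a small sketch using random stand-in data of the same shape (your real train_data comes from the .mat files). The cell2mat line is an equivalent alternative, shown only for comparison.

```matlab
% Stand-in data: 40 x 1 cell array, each cell a 48000 x 1 column vector.
n_obs = 40;  feat_len = 48000;
train_data = cell(n_obs, 1);
for k = 1:n_obs
    train_data{k} = rand(feat_len, 1);
end

% train_data{:} expands to 40 separate column vectors; horzcat places them
% side by side (48000 x 40), and the transpose puts one observation per row.
X = horzcat(train_data{:}).';

% Alternative with cell2mat: make the cell a row first so the vectors are
% concatenated horizontally, then transpose.
X2 = cell2mat(train_data.').';

disp(class(X));   % double
disp(size(X));    % 40 48000
```

Each row of X is now one image and each column one predictor, which is the layout fitctree expects.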
But then your code constructs 400 training labels, indicating that you think you have 400 input images rather than 40. It would make more sense to me if you had
train_label = [ ones(length(a),1);
               -ones(length(b),1)];
Your test_data would be
Y = horzcat(test_data{:}).';
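Putting the pieces together, the whole script might look roughly like the sketch below. It uses small synthetic cell arrays in place of your a, b, c and d (with a shortened feature length so it runs quickly), and the test matrix is called Xtest here only to avoid confusion with label vectors. It assumes the Statistics and Machine Learning Toolbox is installed, since fitctree and predict come from there.

```matlab
% Synthetic stand-ins for the loaded cell arrays (assumed shapes).
rng(0);                                   % reproducible random data
feat_len = 50;                            % shortened from 48000 for the demo
mk = @(n, shift) arrayfun(@(k) randn(feat_len, 1) + shift, (1:n).', ...
                          'UniformOutput', false);
a = mk(20,  1);   % "high" training images, one 50 x 1 vector per cell
b = mk(20, -1);   % "low" training images
c = mk(10,  1);   % "high" test images
d = mk(5,  -1);   % "low" test images

% Training set: one row per observation, labels sized from the data.
train_data  = [a; b];
X           = horzcat(train_data{:}).';
train_label = [ones(numel(a), 1); -ones(numel(b), 1)];
assert(size(X, 1) == numel(train_label), 'Rows of X and labels must match');

TreeMdl = fitctree(X, train_label);       % train the classifier

% Test set: same conversion, same way of building labels.
test_data  = [c; d];
Xtest      = horzcat(test_data{:}).';
test_label = [ones(numel(c), 1); -ones(numel(d), 1)];

predicted = predict(TreeMdl, Xtest);
accuracy  = mean(predicted == test_label);
fprintf('test accuracy on the synthetic data: %.2f\n', accuracy);
```

Building the label vectors from numel(a) and numel(b) rather than hard-coding 400 keeps the number of labels tied to the number of rows in X, so the mismatch described above cannot occur.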