MATLAB: How to only use training set to train Neural Network using toolbox with “divideInd” option

Deep Learning Toolbox, neural network, neural networks

Hello all, I am currently working with the Neural Network Toolbox. I used the "Generate Advanced Script" option at the end and modified how the network divides my data set into training, validation, and testing subsets. I changed the default option from "dividerand" to "divideind" and specified which indices I wanted in training, validation, and testing. However, it seems that the training process is using the ENTIRE data set instead of exclusively using the training set I specified. Is there a way around this? Also, is there a way to check the confusion matrix for EACH individual set (that is, a confusion matrix using only the training, validation, or testing set)? Below are the modifications made to the code generated by the toolbox:

inputs = datainput;
targets = targetvalues;

% Create a Pattern Recognition Network
hiddenLayerSize = 35;
net = patternnet(hiddenLayerSize);

% Choose Input and Output Pre/Post-Processing Functions
% For a list of all processing functions type: help nnprocess
net.inputs{1}.processFcns = {'removeconstantrows','mapminmax'};
net.outputs{2}.processFcns = {'removeconstantrows','mapminmax'};

% Setup Division of Data for Training, Validation, Testing
% For a list of all data division functions type: help nndivide
% *** MODIFICATIONS MADE HERE ***
net.divideFcn = 'divideind'; % Divide data using indices
%net.divideMode = 'sample'; % Divide up every sample
net.divideParam.trainInd = 1:300;
net.divideParam.valInd = 301:360;
net.divideParam.testInd = 361:420;

% For help on training function 'trainlm' type: help trainlm
% For a list of all training functions type: help nntrain
net.trainFcn = 'trainlm'; % Levenberg-Marquardt

% Choose a Performance Function
% For a list of all performance functions type: help nnperformance
net.performFcn = 'mse'; % Mean squared error

% Choose Plot Functions
% For a list of all plot functions type: help nnplot
net.plotFcns = {'plotperform','plottrainstate','ploterrhist', ...
    'plotregression', 'plotfit'};

% Train the Network
[net,tr] = train(net,inputs,targets);

% Test the Network
outputs = net(inputs);
errors = gsubtract(targets,outputs);
performance = perform(net,targets,outputs)

% Recalculate Training, Validation and Test Performance
trainTargets = targets .* tr.trainMask{1};
valTargets = targets .* tr.valMask{1};
testTargets = targets .* tr.testMask{1};
trainPerformance = perform(net,trainTargets,outputs);
valPerformance = perform(net,valTargets,outputs);
testPerformance = perform(net,testTargets,outputs);

% View the Network
%view(net);

% Plots
% Uncomment these lines to enable various plots.
%figure, plotperform(tr)
%figure, plottrainstate(tr)
%figure, ploterrhist(errors)

Thank you

Best Answer

 > How to only use training set to train Neural Network using toolbox with "divideInd" option
 > Asked by Gary 22 minutes ago
 > Latest activity Edited by Gary 15 minutes ago
 > Hello all, currently I am working with the Neural Network toolbox. I used the "Generate Advanced Script" option at the end and made some modifications as to how the network is to divide up my data set into training, validation, and testing. I changed the default option from "dividerand" to "divideInd" and I specified which indices I wanted to be in training, validation, and testing.

Your script explicitly sets many net properties that are already at their default values. Concentrate on the defaults that have to be overridden. For example, if you have a classification or pattern recognition problem, first create the net with the default number of hidden nodes and omit the ending semicolon:

net = patternnet % No semicolon

The resulting command line printout will reveal the defaults. You can then concentrate on the defaults you want to override.
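As a minimal sketch of that advice, the lines below display the data-division defaults and then override only those. The index ranges are the ones from the question and assume a data set of at least 420 samples; nothing here trains the net.

```matlab
% Sketch: inspect the defaults, then override only the data division.
net = patternnet;               % default hidden layer size

disp(net.divideFcn)             % 'dividerand' unless overridden
disp(net.divideParam)           % trainRatio / valRatio / testRatio

% Override only what differs from the defaults.
% Index ranges are taken from the question (assumes >= 420 samples).
net.divideFcn = 'divideind';
net.divideParam.trainInd = 1:300;
net.divideParam.valInd   = 301:360;
net.divideParam.testInd  = 361:420;

disp(net.divideParam)           % now lists the three index vectors
```

Everything else (training function, performance function, plots) stays at whatever the toolbox version in use provides by default.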

It is also useful to run the code examples in the documentation

 help patternnet
 doc patternnet

Be sure the target matrix consists of unit-vector columns, each containing a single "1". The row index of the "1" is the class index of the corresponding input column. The relationship between the target matrix and the class indices is given by

 target = ind2vec(trueclassindex)
 trueclassindex = vec2ind(target)
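A small round trip may make the target format concrete. The labels below are hypothetical, chosen only for illustration; ind2vec returns a sparse matrix, so full is used to display it.

```matlab
% Sketch: round trip between class indices and unit-vector target columns.
trueclassindex = [1 3 2 1];             % hypothetical labels, 3 classes

target = full(ind2vec(trueclassindex))  % ind2vec returns a sparse matrix
% target =
%      1     0     0     1
%      0     0     1     0
%      0     1     0     0

recovered = vec2ind(target)             % 1 3 2 1

% Every column should contain exactly one 1.
assert(all(sum(target, 1) == 1))
```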

Again, omitting some of the ending semicolons will reveal useful information.

 % However, it seems that the training process is using the ENTIRE data set instead of exclusively using the training set I specified earlier. Is there a way around this?

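You are confusing how the subsets are used.

 data   = design + test          % test == evaluation
 design = train + validation     % validation ~= evaluation

The train function designs with the design data and evaluates with the nondesign test data. It updates the weights using only the training data, and uses the nontraining validation data to stop training when the validation error does not decrease for max_fail consecutive epochs. Finally, it evaluates the net with the nondesign test data. In your script, the lines after training (outputs = net(inputs) and performance = perform(net,targets,outputs)) evaluate the trained net on all of the samples, which is why the overall result covers the entire data set.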
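One way to check which samples played which role is to compare the indices stored in the training record with the ones that were requested. This is only a sketch: iris_dataset is a toolbox sample set standing in for the asker's data, the split is illustrative, and the columns are assumed to be grouped by class, which is why each subset draws from all three blocks.

```matlab
% Sketch: confirm which samples train() used for each role.
[x, t] = iris_dataset;                  % 4x150 inputs, 3x150 targets
rng(0)                                  % reproducible initial weights

net = patternnet(10);
net.divideFcn = 'divideind';
% Illustrative split (not the asker's 1:300 ranges); columns assumed
% grouped by class, so every subset takes cases from each block of 50.
net.divideParam.trainInd = [ 1:30, 51:80,  101:130];
net.divideParam.valInd   = [31:40, 81:90,  131:140];
net.divideParam.testInd  = [41:50, 91:100, 141:150];
net.trainParam.showWindow = false;      % no training GUI

[net, tr] = train(net, x, t);

% The training record stores the division that was applied.
sameTrain = isequal(tr.trainInd, net.divideParam.trainInd)
sameVal   = isequal(tr.valInd,   net.divideParam.valInd)
sameTest  = isequal(tr.testInd,  net.divideParam.testInd)

maxFail   = net.trainParam.max_fail     % validation-failure limit
stopWhy   = tr.stop                     % reason training ended
bestEpoch = tr.best_epoch               % epoch with lowest validation error
```

If the three comparisons return true, the weights were fitted on the training indices only; the validation and test indices were used for stopping and for evaluation.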

The separate training/validation/test performances can be obtained from the training record tr via

[net, tr, y, e] = train(net, input, target); tr = tr % NO SEMICOLON
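The fields of interest can also be read directly rather than picked out of the printout. The sketch below reuses the toolbox's iris_dataset and an illustrative index split as stand-ins for the asker's data, and then recomputes each value from the trained net with perform, restricted to the columns of each subset.

```matlab
% Sketch: per-subset performance from the training record.
[x, t] = iris_dataset;                  % stand-in data, 150 samples
rng(0)                                  % reproducible initial weights

net = patternnet(10);
net.divideFcn = 'divideind';
net.divideParam.trainInd = [ 1:30, 51:80,  101:130];   % illustrative split
net.divideParam.valInd   = [31:40, 81:90,  131:140];
net.divideParam.testInd  = [41:50, 91:100, 141:150];
net.trainParam.showWindow = false;      % no training GUI

[net, tr] = train(net, x, t);

% Values recorded at the best (lowest validation error) epoch.
trnPerf = tr.best_perf
valPerf = tr.best_vperf
tstPerf = tr.best_tperf

% Recompute from the trained net, one subset of columns at a time.
y = net(x);
trnPerf2 = perform(net, t(:, tr.trainInd), y(:, tr.trainInd))
valPerf2 = perform(net, t(:, tr.valInd),   y(:, tr.valInd))
tstPerf2 = perform(net, t(:, tr.testInd),  y(:, tr.testInd))
```

Indexing the columns with tr.trainInd, tr.valInd and tr.testInd keeps each figure tied to exactly one subset, unlike a single perform call over all of the data.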

For examples, search the NEWSGROUP and ANSWERS using

patternnet greg

% Also is there a way to check the confusion matrix for EACH individual set? (meaning confusion matrix using only training, validation, testing set)

Yes. Use the indices to separate the cases, then call confusion on each subset separately. For an example, search

confusion greg
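A minimal sketch of the per-subset call, again with the toolbox's iris_dataset and an illustrative index split standing in for the asker's data. The first output of confusion is the fraction of misclassified cases and the second is the confusion matrix of counts.

```matlab
% Sketch: one confusion matrix per subset.
[x, t] = iris_dataset;                  % stand-in data, 3 classes
rng(0)                                  % reproducible initial weights

net = patternnet(10);
net.divideFcn = 'divideind';
net.divideParam.trainInd = [ 1:30, 51:80,  101:130];   % illustrative split
net.divideParam.valInd   = [31:40, 81:90,  131:140];
net.divideParam.testInd  = [41:50, 91:100, 141:150];
net.trainParam.showWindow = false;      % no training GUI

[net, tr] = train(net, x, t);
y = net(x);                             % outputs for all samples

% Select the columns of each subset before calling confusion.
[cTrn, cmTrn] = confusion(t(:, tr.trainInd), y(:, tr.trainInd));
[cVal, cmVal] = confusion(t(:, tr.valInd),   y(:, tr.valInd));
[cTst, cmTst] = confusion(t(:, tr.testInd),  y(:, tr.testInd));

cmTrn, cmVal, cmTst                     % 3x3 count matrices
errPct = 100 * [cTrn cVal cTst]         % percent misclassified per subset
```

The same column selections can be passed to plotconfusion if a plot is preferred over the raw counts.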

Hope this helps.

Thank you for formally accepting my answer

Greg
