MATLAB: Reproducibility in neural network

Tags: MATLAB, neural network, overfitting, classification, random

I'm trying to break down the MATLAB neural network GUI by working out what each feature does. I'm keeping it simple by using the default training function (trainscg) and the MATLAB wine dataset for training/testing. For the time being, and for experimentation, I've removed the validation dataset and set the NN up with 50 hidden nodes.
What I can't work out is why the results it produces are exactly the same each time. It takes exactly the same number of epochs to reach the minimum gradient, the performance and gradient values are identical, and the confusion matrix is identical. The only explanation I can think of is that the data splitting and the initialisation of the weights are not randomised, but everywhere I look online suggests that, by default, MATLAB does randomise both.
What am I missing? Are the weights and datasets not randomised after all? Code being used is below.
% Load MATLAB default wine dataset.
[x1,t1] = wine_dataset;
% Create net, 50 hidden nodes.
net = patternnet(50);
% Split the data into a 75% training and 25% testing group. Validation
% removed.
net.divideParam.trainRatio = 3/4;
net.divideParam.valRatio = 0;
net.divideParam.testRatio = 1/4;
% Train the network (capture the trained net returned by train).
net = train(net,x1,t1);
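One way to test whether randomness is involved at all is to control the seed explicitly with MATLAB's rng function. This is a sketch, not part of my original script: if results still change between rng('shuffle') runs, the data split and initial weights are being randomised; fixing the seed should then make runs reproducible on purpose.

```matlab
% Sketch: control the random seed to test whether the split/weights
% are actually randomised.
rng('shuffle');                      % new seed each run -> results should differ
[x1,t1] = wine_dataset;
net = patternnet(50);
net.divideParam.trainRatio = 3/4;
net.divideParam.valRatio   = 0;
net.divideParam.testRatio  = 1/4;
net = train(net,x1,t1);
% For deliberate, exact reproducibility, fix the seed instead:
% rng(0);
```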

Best Answer

YIKES!!! You have entered the creepy world of
(TRUMPETS PLEASE!)
OVERTRAINING AN OVERFIT NET!!!
You can prevent the overtraining by
1. Using a validation set. Look at the performance plot
and see the drastic log-scale difference in performance
between the training and testing subset performances.
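A minimal sketch of point 1 (the split ratios here are illustrative, not prescriptive): restore a validation subset so training stops early when validation performance degrades, then inspect the performance plot.

```matlab
net = patternnet(50);
net.divideParam.trainRatio = 0.70;   % illustrative split
net.divideParam.valRatio   = 0.15;   % early stopping monitors this subset
net.divideParam.testRatio  = 0.15;
[net,tr] = train(net,x,t);
plotperform(tr)                      % compare train/val/test curves (log scale)
```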
2. Using regularization. With regression this means
replacing the performance function MSE with MSEREG
which is something like
MSEREG = MSE + lambda * norm(weights)
Therefore, if you use large weights or, more likely, too many weights due to too many hidden nodes, training will be terminated earlier.
However, with classification using patternnet, the default performance function is CROSSENTROPY. I am not sure whether MATLAB supports combining it with regularization.
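In newer toolbox versions, regularization is set through a performance parameter rather than by swapping the performance function; a sketch (the 0.1 weighting is illustrative, and switching performFcn to 'mse' is an assumption to sidestep the CROSSENTROPY question above):

```matlab
net = patternnet(50);
net.performFcn = 'mse';                 % assumption: use an MSE-type measure
net.performParam.regularization = 0.1;  % illustrative regularization weight
[net,tr] = train(net,x,t);
```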
3. Use the Bayesian regularization training function TRAINBR, which by default uses Nval = 0 and a form of MSEREG. HOWEVER, I'm not sure whether MATLAB supports combining it with CROSSENTROPY.
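Point 3 as a sketch; switching the performance function to 'mse' and removing the data division are assumptions consistent with how trainbr is normally used, not a tested recipe for this dataset:

```matlab
net = patternnet(50);
net.trainFcn = 'trainbr';   % Bayesian regularization
net.performFcn = 'mse';     % assumption: trainbr expects MSE-type performance
net.divideFcn = '';         % trainbr typically trains without a validation split
[net,tr] = train(net,x,t);
```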
4. Instead of preventing overtraining, you can prevent overfitting by just using fewer hidden nodes:
[x t] = wine_dataset;
[ I N ] = size(x) % [13 178 ]
[O N ] = size(t) % [ 3 178 ]
vart = mean(var(t',1)) % 0.21944
Ntst = round(0.25*N) % 45
Ntrn = N-Ntst % 133
Ntrneq = Ntrn*O % 399 training equations
5. When the net is configured with H = 50 hidden nodes, the number of unknown weights will be
Nw = (I+1)*H+(H+1)*O % 853 unknown weights
which is more than twice the number of training equations!!!
==> OVERFITTING!
H = 50
net = patternnet(H);
Nw = net.numWeightElements % 50 when unconfigured
net = configure(net,x,t);
Nw = net.numWeightElements % 853 when configured
Note: Training will automatically configure an unconfigured net
To avoid overfitting
Nw <= Ntrneq <==> H <= Hub
Hub = (Ntrneq-O)/(I+O+1) % 23.294
Therefore H <= 23 avoids overfitting
net.divideParam.trainRatio = 3/4;
net.divideParam.valRatio = 0;
net.divideParam.testRatio = 1/4;
[net tr y e] = train(net,x,t);
% y = net(x); e = t-y % error
NMSE = mse(e)/vart % 0.017875
Rsq = 1- NMSE % 0.98213
Therefore, the net models 98.2% of the average target variance.
However, the net is overfitted. Therefore, the difference between the test and training performances is very important.
Moreover, the net is a classifier. Therefore, the difference between the training and test performances in terms of CROSSENTROPY and CLASSIFICATION RATE is more important!
indtrn = tr.trainInd;
indval = tr.valInd % Empty matrix: 1-by-0
indtst = tr.testInd;
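The subset indices above can be used to compare training and test classification rates directly; a sketch using the toolbox's confusion function (which returns the fraction misclassified):

```matlab
y = net(x);
% Percent misclassified on each subset
etrn = 100*confusion(t(:,indtrn), y(:,indtrn))  % training error rate, %
etst = 100*confusion(t(:,indtst), y(:,indtst))  % test error rate, %
% A large gap between etst and etrn is the signature of
% overtraining an overfit net.
```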
TO BE CONTINUED
Hope this helps.
Thank you for formally accepting my answer
Greg