MATLAB: Stochastic gradient descent neural network updating net in matlab

Deep Learning Toolbox, gradient descent, net, neural network, training

Is it possible to train net with stochastic gradient descent in MATLAB? If so, how?
I observe that train completely ignores the previously trained network's information and relearns everything from scratch. Incremental updating would be helpful for large-scale training: if I train on the complete data set, it takes a very long time.
For example, train iteratively on 100 parts of the data.
TF1 = 'tansig'; TF2 = 'tansig'; TF3 = 'tansig'; % transfer functions per layer; TF3 is for the output layer
net = newff(trainSamples.P, trainSamples.T, [NodeNum1, NodeNum2, NodeOutput], {TF1 TF2 TF3}, 'traingdx'); % network created
net.trainFcn = 'traingdm';
net.trainParam.epochs   = 1000;
net.trainParam.min_grad = 0;
net.trainParam.max_fail = 2000; % large value to stand in for infinity
while true % iteratively take 10 data points at a time
    p % <= gets updated with the next 10 new data points
    t % <= gets updated with the next 10 new data points
    [net, tr] = train(net, p, t);
end
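One way to get genuine incremental updating is the toolbox's adapt function, which updates the network's weights chunk by chunk instead of restarting a batch optimization each time. A minimal sketch, assuming the full data sits in matrices P and T (assumed names) and using the chunk size of 10 from the loop above:

```matlab
% Incremental updating with ADAPT instead of TRAIN (sketch, not a full design).
% P is an I-by-N input matrix, T an O-by-N target matrix (assumed names).
chunkSize = 10;
numChunks = floor(size(P, 2) / chunkSize);
for c = 1:numChunks
    k = (c - 1)*chunkSize + (1:chunkSize);      % columns for this chunk
    [net, y, e] = adapt(net, P(:, k), T(:, k)); % weights updated in place
end
```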

Best Answer

2. Use the largest nndatasets example in the NNTBX:
help nndatasets
doc nndatasets
3. It is worthwhile to look at static correlation coefficients (help/doc corrcoef) and plots to help find
a. inputs that are so weakly correlated with all of the targets that they can be omitted;
b. inputs that are so highly correlated with other inputs that they can be omitted.
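A sketch of that screening step, assuming x is an I-by-N input matrix and t an O-by-N target matrix; the 0.1 and 0.95 thresholds are illustrative choices, not toolbox defaults:

```matlab
% Correlation screening of inputs (sketch)
I   = size(x, 1);
R   = abs(corrcoef([x' t']));          % correlations among inputs and targets
Rxt = R(1:I, I+1:end);                 % |input-target| correlations
Rxx = R(1:I, 1:I);                     % |input-input| correlations
weakInputs = find(max(Rxt, [], 2) < 0.1)   % candidates for omission (a)
[~, c] = find(triu(Rxx, 1) > 0.95);
redundantInputs = unique(c)                % candidates for omission (b)
```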
4. It may be useful to look at the input dimensionality reduction obtained with linear models (help regress)
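For instance, a hedged sketch of that linear-model check, again assuming x (I-by-N) and t (O-by-N):

```matlab
% Linear baseline per target with REGRESS (sketch)
[I, N] = size(x);
X = [ones(N, 1) x'];                   % design matrix with intercept column
for j = 1:size(t, 1)
    [b, ~, ~, ~, stats] = regress(t(j, :)', X);
    fprintf('target %d: R^2 = %.3f\n', j, stats(1));  % stats(1) is R-squared
end
```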
5. Try to use as many defaults as possible when starting a NN design. Defaults that should be overridden should become evident during design trials.
6. What are the dimensions of your input and target matrices?
7. How many hidden nodes?
8. It is not necessary to use more than one hidden layer.
9. I used the largest nndataset
[ x,t] = building_dataset;
with size(x) = [14 4208], size(t) = [3 4208] and H = 70 hidden nodes. This yields about 10 times more training equations, 3*4208 = 12,624, than there are unknown weights, (14+1)*70 + (70+1)*3 = 1263.
Since the net was not close to being overfit, I used only a training set and obtained an adjusted R-squared of 0.99 in 72 seconds with a straightforward FITNET design.
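That straightforward design can be sketched as follows; the dividetrain split and the NMSE-based R-squared formula are my assumptions about the details, not quoted from the actual run:

```matlab
% Sketch of the straightforward FITNET design (H = 70 hidden nodes)
[x, t] = building_dataset;
net = fitnet(70);                        % default trainlm
net.divideFcn = 'dividetrain';           % all data used for training
[net, tr] = train(net, x, t);
y    = net(x);
NMSE = mse(t - y) / mean(var(t', 1));    % normalized mean squared error
R2   = 1 - NMSE                          % (unadjusted) R-squared
```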
However, a design looping over 10 randomly chosen subsets took 109 seconds. The syntax, after randomly shuffling the columns with randperm(4208), was
M    = 420   % floor(4208/10)
imax = 10
for i = 1:imax
    k = 1 + M*(i-1) : M*i;
    [net, tr, y(:, k)] = train(net, x(:, k), t(:, k));
end
This probably doesn't show a savings because 14*4208 is not too large for the default trainlm.
I think all you have to do is use a larger data set (large enough to choke trainlm) and a more appropriate training function, e.g., trainscg or trainrp.
Hope this helps.
Thank you for formally accepting my answer