MATLAB: Could you please help me with artificial neural networks – supervised learning

artificial neural network, good tutorial

I have a data set and would like to know the best NN topology to use (number of hidden layers and number of nodes; I am currently using [30 50 30]). I have about 1000 samples with 20 input variables and one output.
I trained the network using the following code, but my test (with a new data set never seen by the ANN) didn't give the desired output. Could you please verify my method?
%load data
inputs_bn, targets_bn;
%Normalize - Do I have to normalize the data?
[inputs,ps] = mapminmax(inputs_bn);
[targets,ts] = mapminmax(targets_bn);
HL=[30 50 30];
%inputs
%targets
% Create a Fitting Network
hiddenLayerSize = HL;
net=fitnet(hiddenLayerSize,'traingdx'); % Is this used for predictions?
% Choose Input and Output Pre/Post-Processing Functions
% For a list of all processing functions type: help nnprocess
net.inputs{1}.processFcns = {'removeconstantrows','mapminmax'};
net.outputs{net.numLayers}.processFcns = {'removeconstantrows','mapminmax'}; % output layer is the last layer (layer 4 with three hidden layers)
% Setup Division of Data for Training, Validation, Testing
% For a list of all data division functions type: help nndivide
net.divideFcn = 'dividerand'; % Divide data randomly
net.divideMode = 'sample'; % Divide up every sample
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
net.trainFcn = 'trainlm'; % Levenberg-Marquardt
net.trainParam.min_grad=1e-8;
% Choose a Performance Function
%change from

%net.performFcn = 'mse'; % Mean squared error
%change from
%change to

net.performFcn='msereg';
net.performParam.ratio=0.5;
%change to
% Choose Plot Functions
% For a list of all plot functions type: help nnplot
net.plotFcns = {'plotperform','plottrainstate','ploterrhist', ...
'plotregression', 'plotfit'};
% Train the Network
[net,tr] = train(net,inputs,targets,'useParallel','yes','showResources','yes'); %trainr gave bad results
% Test the Network
outputs11 = net(inputs); % network outputs on the normalized scale
outputs=mapminmax('reverse',outputs11,ts); % outputs mapped back to the original target scale
errors = gsubtract(targets_bn,outputs); % errors on the original scale
performance = perform(net,targets,outputs11) % normalized targets vs. normalized outputs
% Recalculate Training, Validation and Test Performance
trainTargets = targets .* tr.trainMask{1};
valTargets = targets .* tr.valMask{1};
testTargets = targets .* tr.testMask{1};
trainPerformance = perform(net,trainTargets,outputs11)
valPerformance = perform(net,valTargets,outputs11)
testPerformance = perform(net,testTargets,outputs11)
% View the Network
%view(net)
% Plots
% Uncomment these lines to enable various plots.
%figure, plotperform(tr)
%figure, plottrainstate(tr)
%figure(1), plotfit(net,inputs,targets)
%figure, plotregression(targets,outputs)
figure(111), ploterrhist(errors)
%%%%%%%%
%%%%Load Test DATA
% Target_output
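% input_Test must be on the same scale as the training inputs, e.g.
% input_Test = mapminmax('apply',input_Test,ps); if it was loaded unnormalized
outputs_Test = sim(net,input_Test);
outputs_Test=mapminmax('reverse',outputs_Test,ts);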
errors = outputs_Test - Target_output;
plot(errors)
Thanks!
Jude

Best Answer

1. It is very seldom that you will need
a. That many inputs
b. More than 1 hidden layer
c. Anywhere near that many hidden nodes.
2. Typically, if you transform your variables to zero-mean/unit-variance via ZSCORE or MAPSTD, the coefficients of a linear model will indicate which variables can probably be ignored because they are either weakly correlated to the target OR are highly correlated with other variables.
Alternatives are
a. Add squares and/or cross-products to the linear (in coefficients) model
b. Use functions STEPWISE and/or STEPWISEFIT
3. PLEASE
a. Do not post commands that assign default values.
b. Include results of applying your code to an accessible data set so that we know we are on the same page.
c. Instead of posting your huge data set, just pick one of the MATLAB example sets:
help nndatasets
doc nndatasets
4. For the purpose of reproducibility, initialize the RNG before obtaining the random initial weights and random trn/val/tst data division.
5. I have posted many tutorials that emphasize minimizing the number of hidden nodes, H, to obtain better performance on non-training (validation, test and unseen) data.
6. Basically, you would like the number of unknown weights
Nw = (I+1)*H+(H+1)*O
to be much less than the number of training equations
Ntrneq = round(0.7*N*O) % default approx.
A necessary condition is
H <= Hub = floor((Ntrneq-O)/(I + O +1))
However H << Hub is preferable.
With I = 20, O = 1, N = 1000
Ntrneq = Ntrn = 700
Hub = floor(699/22) = 31
7. My tutorials will explain how to perform a double loop search for
a. No. of hidden nodes
b. Initial RNG state (reproducible initial weights & data division).
8. For regression examples, search using subsets of the keywords
greg fitnet tutorial Ntrials
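
As a rough illustration of points 4, 6 and 7, the double loop over the number of hidden nodes and the RNG state might look like the sketch below. It is not taken from the tutorials mentioned above. The example data set (simplefit_dataset), the search limit Hmax, the number of trials Ntrials, and the use of the trial index as the RNG seed are all assumptions made for illustration, and the Neural Network (Deep Learning) Toolbox is required.

```matlab
% Sketch only: Hmax, Ntrials and the seeding scheme are illustrative assumptions.
[x, t] = simplefit_dataset;            % example data set shipped with the toolbox
[I, N] = size(x);                      % I inputs, N samples
O = size(t, 1);                        % O outputs

Ntrneq = round(0.7*N*O);               % approx. number of training equations
Hub = floor((Ntrneq - O)/(I + O + 1)); % upper bound on H from Nw <= Ntrneq

Hmax = min(10, Hub);                   % assumed search limit, kept below Hub
Ntrials = 5;                           % assumed number of random restarts per H
valPerf = NaN(Hmax, Ntrials);          % validation MSE for each (H, trial)

for H = 1:Hmax
    for trial = 1:Ntrials
        rng(trial);                    % reproducible initial weights and data division
        net = fitnet(H);               % a single hidden layer with H nodes
        net.trainParam.showWindow = false;
        [net, tr] = train(net, x, t);
        valPerf(H, trial) = tr.best_vperf; % validation MSE at the best epoch
    end
end

% Pick the (H, seed) pair with the lowest validation error
[bestPerf, idx] = min(valPerf(:));
[bestH, bestTrial] = ind2sub(size(valPerf), idx);
fprintf('Best H = %d (seed %d), validation MSE = %g\n', bestH, bestTrial, bestPerf);
```

Ranking the candidates on validation error rather than test error keeps the test subset available for an unbiased final estimate. Among designs with similar validation error, the one with the smallest H is usually the safer choice.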
Hope this helps.
Greg
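
The variable screening suggested in point 2 can be sketched in a few lines of plain MATLAB. The data here are synthetic and the three-variable setup is purely an assumption for illustration. The standardization is written out by hand to mimic what ZSCORE or MAPSTD would do, and it relies on implicit expansion (R2016b or later).

```matlab
% Sketch only: synthetic data, chosen so that x3 is irrelevant by construction.
rng(0);                                % reproducible example
N  = 200;
x1 = randn(1, N);
x2 = randn(1, N);
x3 = randn(1, N);                      % has no influence on the target
t  = 3*x1 - 2*x2 + 0.1*randn(1, N);

X = [x1; x2; x3];                      % variables in rows, samples in columns

% Standardize inputs and target to zero mean / unit variance
Xs = (X - mean(X, 2)) ./ std(X, 0, 2);
ts = (t - mean(t, 2)) ./ std(t, 0, 2);

% Least-squares linear model on the standardized variables: ts ~ b' * Xs
b = (Xs') \ (ts');

disp(b');                              % a small |b(i)| suggests variable i can probably be dropped
```

Because all variables share the same scale, the magnitudes of the coefficients are directly comparable. When two inputs are highly correlated with each other, the individual coefficients become unreliable, which is where STEPWISE or STEPWISEFIT may be the better tool.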