%% 1. Importing data
% Nifty.dat and Sensex.dat are each 2003x1 matrices of daily stock market index values
> load Nifty.dat;
> load Sensex.dat;
% To scale the data, it is converted to log values:
> lognifty = log(Nifty);
> logsensex = log(Sensex);
> X = tonndata(lognifty,false,false);
> T = tonndata(logsensex,false,false);
%% 2. Data preparation
% Input and target series are divided into two groups of data:
% 1st group: used to train the network
> N = 300; % length of the held-out span (see Ntst = 300 below)
> inputSeries = X(1:end-N);
> targetSeries = T(1:end-N);
% 2nd group: new data used for simulation. inputSeriesVal will be used for
% predicting new targets; targetSeriesVal will be used for network
% validation after prediction.
Notation:
data = design + test
design = training + validation
Val subsets are used repeatedly with Trn subsets to DESIGN a net with a good set of training parameters (e.g., input delays, feedback delays, number of hidden nodes, stopping epoch, etc.). The best of multiple designs is typically chosen by indirectly minimizing MSEval.
After the best design is chosen, the nondesign Test subset is used to estimate generalization performance on nondesign data.
By DEFAULT, the data will be divided RANDOMLY into THREE trn/val/tst subsets according to
dividerand( 2003, 0.7, 0.15, 0.15 )
I disagree with the use of dividerand for uniformly spaced time series. Replace it with one of the other divide functions. (When Nval = Ntst = 0, I use 'dividetrain'. Otherwise I use 'divideblock' or 'divideind' to maintain uniform spacing.)
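For example, after the net is created (see section 3), the division can be switched to contiguous blocks along these lines. This is only a sketch: the small narxnet here is a stand-in, and the ratios shown are the toolbox defaults.

```matlab
% Sketch: preserve the uniform time spacing of the series by dividing the
% timesteps into contiguous trn/val/tst blocks instead of random samples.
net = narxnet(1:2, 1:2, 10);          % small example NARX net (stand-in)
net.divideFcn  = 'divideblock';       % contiguous blocks, not random
net.divideMode = 'time';              % divide across timesteps
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;
```

With 2003 timesteps and these ratios, the last 15% (the tst block) is the most recent data, which is what you want to test on for a time series.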
> inputSeriesVal = X(end-N+1:end);
> targetSeriesVal = T(end-N+1:end);
Change "Val" to "Test" since the subsets are only used for performance evaluation (NOT "validation") and not design.
Since a NNTBX BUG will not allow a test subset without a validation subset and vice versa, there are two options:
1. Use trn/val/tst (Nval=Ntst = 300) and 'divideblock' or 'divideind'
(recommended)
2. a. Remove the tst subset (Ntst = 300) from training,
   b. do not use a val set (Nval = 0),
   c. use 'dividetrain' to train only on the training data (Ntrn = 1703),
   d. calculate the test subset performance separately.
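Option 2 could be sketched as follows, assuming N = 300 and the inputSeries/targetSeries split from section 2 (the small narxnet is again just a stand-in):

```matlab
% Sketch of option 2: no val/tst division during training; the last
% N = 300 points are held out and scored separately afterwards.
net = narxnet(1:2, 1:2, 10);
net.divideFcn = 'dividetrain';        % Ntrn = 1703, Nval = Ntst = 0
[Xs,Xi,Ai,Ts] = preparets(net,inputSeries,{},targetSeries);
net = train(net,Xs,Ts,Xi,Ai);
% Step d: compute the held-out performance separately, e.g. via the
% closed-loop prediction on the last 300 points as in section 5.
```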
%% 3. Network Architecture
> delay = 2;
> neuronsHiddenLayer = 50;
Use the autocorrelation function to determine the significant feedback delays. Use the crosscorrelation function to determine the significant input delays.
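A minimal sketch of that check, using xcorr from the Signal Processing Toolbox (nncorr from the NN toolbox works similarly). The AR(1) toy series below are only stand-ins for lognifty/logsensex so the sketch runs on its own; the significance bound 1.96/sqrt(n) is the usual approximate 95% level:

```matlab
% Toy stand-ins for the log index series, so this sketch is self-contained
rng(0); n = 500;
t = filter(1,[1 -0.8],randn(n,1));      % AR(1) "target" series
x = [0; t(1:end-1)] + 0.1*randn(n,1);   % "input" that leads the target

% Standardize, then look for significant correlation lags
t = (t - mean(t)) / std(t);
x = (x - mean(x)) / std(x);
maxlag = 20;
[acf, lags] = xcorr(t, maxlag, 'coeff');    % target autocorrelation
ccf = xcorr(t, x, maxlag, 'coeff');         % target/input crosscorrelation
acf = acf(:).'; ccf = ccf(:).';
sig = 1.96/sqrt(n);                         % ~95% significance bound
fbLags = lags(abs(acf) > sig & lags > 0)    % candidate feedback delays
idLags = lags(abs(ccf) > sig & lags > 0)    % candidate input delays
```

The significant positive lags then replace the arbitrary 1:delay choice in narxnet.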
% Network Creation
> net = narxnet(1:delay,1:delay,neuronsHiddenLayer);
%% 4. Training the network
> [Xs,Xi,Ai,Ts] = preparets(net,inputSeries,{},targetSeries);
> [net,tr,Ys,Es,Xf,Af] = train(net,Xs,Ts,Xi,Ai);
> tr = tr % no trailing semicolon, so the training record is displayed
> view(net)
> Y = net(Xs,Xi,Ai);
% one-step-ahead prediction
> perf = perform(net,Ts,Y);
%% 5. Multi-step ahead prediction
> inputSeriesPred = [inputSeries(end-delay+1:end), inputSeriesVal];
> targetSeriesPred = [targetSeries(end-delay+1:end), con2seq(nan(1,N))];
> netc = closeloop(net);
> view(netc)
Check netc on previous data. If performance is bad, improve it by training netc on the previous data.
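For example (a sketch, reusing net, inputSeries, targetSeries and the openloop perf from the sections above; the 10x threshold is just an illustrative guess):

```matlab
% Check the closed-loop net on the design data before predicting new data
netc = closeloop(net);
[Xc,Xic,Aic,Tc] = preparets(netc,inputSeries,{},targetSeries);
Yc    = netc(Xc,Xic,Aic);
perfc = perform(netc,Tc,Yc)            % compare with the openloop perf
if perfc > 10*perf                     % much worse than openloop?
    netc = train(netc,Xc,Tc,Xic,Aic);  % then retrain with the loop closed
end
```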
> [Xs,Xi,Ai,Ts] = preparets(netc,inputSeriesPred,{},targetSeriesPred);
> yPred = netc(Xs,Xi,Ai);
> perf = perform(netc,targetSeriesVal,yPred); % perform(net,targets,outputs)
> figure;
> plot([cell2mat(targetSeries), nan(1,N);
>       nan(1,length(targetSeries)), cell2mat(yPred);
>       nan(1,length(targetSeries)), cell2mat(targetSeriesVal)]')
> legend('Original Targets','Network Predictions','Expected Outputs')
% The network predictions are very bad. I guess there is some problem
% with the closed loop's initial input states and initial layer states.
% Please help.
1. Optimize ID and FD
2. Use trn/val/tst with 'divideblock' or 'divideind'
3. Compare netc and net performance on openloop data
4. If necessary, use train on netc.
5. Then consider nondesign data
Hope this helps
Thank you for formally accepting my answer
Greg