MATLAB: Logsig activation function in irradiance post-processing

Tags: ann, divideint, fitnet, fitnet tutorial, logsig, transfer function

Hello,
I have irradiance and temperature forecasts, and I'm trying to improve the irradiance forecasts using a neural network for my Master's thesis. For that, I'm using fitnet function (is this MLP?). I'm currently testing one and two hidden layer networks with different sizes.
My question is mainly about the activation functions in the hidden layers and in the output layer. I have normalized the irradiance (both the forecasts and the targets) and the temperature so that they range from 0 to 1 (for irradiance, at least, that is the natural normalization range; a range of -1 to 1 doesn't make sense). As such, I have removed mapminmax from the preprocessing:
net.input.processFcns = {'removeconstantrows'};
net.output.processFcns = {'removeconstantrows'};
It makes sense, right?
Additionally, having read the Neural Network Toolbox User's Guide, I saw that the default transfer function is tansig, which outputs in the range [-1, 1], and changed it to logsig, which outputs in [0, 1]. The guide says: "…if you want to constrain the outputs of a network (such as between 0 and 1), then the output layer should use a sigmoid transfer function (such as logsig)." My problem is that I don't see different results when using tansig and logsig. I actually suspect (this has not been thoroughly tested yet) that logsig in the output layer gives slightly worse results. Do these results make sense? And does it make sense to use logsig?
net.layers{1}.transferFcn = 'logsig';
net.layers{2}.transferFcn = 'logsig';
net.layers{3}.transferFcn = 'logsig'; % If using 2 hidden layers
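(For context on why the two functions behave so similarly in hidden layers: tansig and logsig are related by an affine transformation, tansig(n) = 2*logsig(2n) - 1, so a hidden layer using either one can represent the same family of mappings once the weights and biases rescale. A quick numerical check of that identity, which is my own illustration rather than part of the thread:)

```matlab
% Sanity check: tansig(n) = 2*logsig(2*n) - 1 for all n,
% so hidden layers with either transfer function are equally expressive;
% the choice mainly matters for the range of the OUTPUT layer.
n = linspace(-5, 5, 101);
maxDiff = max(abs(tansig(n) - (2*logsig(2*n) - 1)))
% maxDiff should be at machine precision (on the order of 1e-16)
```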
Also, is it important (and even possible) to "tell" the network that input 1 is irradiance, the same quantity as the only output? (I mean, should the network know that I have G and T as inputs and G as the output, or is that completely irrelevant, so it treats them as generic inputs X and Y and output Z?)
One last question: I have used
net.divideFcn = 'divideint'; % Interleaved division
Does this guarantee that in all training runs the test set is composed of the same elements (for example, that entries 7, 14, and 21 are always used for testing)?
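(For reference, unlike 'dividerand', 'divideint' is deterministic: it assigns indices by interleaving according to the ratios, so for a fixed data length and fixed divideParam ratios the test indices are identical on every run. A quick check of this, my own sketch assuming the default 0.7/0.15/0.15 ratios:)

```matlab
% divideint assigns sample indices deterministically by interleaving,
% so repeated calls with the same length and ratios give identical splits.
Q = 21;                                       % number of samples (illustrative)
[trainInd1, valInd1, testInd1] = divideint(Q, 0.7, 0.15, 0.15);
[trainInd2, valInd2, testInd2] = divideint(Q, 0.7, 0.15, 0.15);
isequal(testInd1, testInd2)                   % always true: the split is fixed
```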
I'm sorry for the long post, I really hope someone can enlighten me! If it matters, I'm attaching my data and code.
Thank you,
Bernardo Fonseca

Best Answer

You have overthought the design. You probably only have to change the random initial weights and/or the number of hidden nodes in a single hidden layer.
1. See the documentation example.
help fitnet
doc fitnet
2. Make the following modifications (note the omitted semicolons):
[x, t] = simplefit_dataset;
[I, N] = size(x)
[O, N] = size(t)
vart1 = mean(var(t', 1)) % Reference MSE
% vart1 = var(t, 1) when O = 1
net = fitnet;            % H = 10 is the default
rng('default')           % Initialize the RNG for reproducibility
[net, tr, y, e] = train(net, x, t);
% y = net(x); e = t - y;
view(net)
NMSE = mse(e)/vart1      % Normalized MSE
Rsq = 1 - NMSE           % Fraction of target variance modeled by the net
% See https://en.wikipedia.org/wiki/R2
3. Use the code, as is, with your example. If Rsq is not close to unity, train Ntrials = 10 candidate nets in a loop to obtain different random initial weights and random trn/val/tst data divisions:
...
for i = 1:Ntrials
    net = configure(net, x, t);       % New random initial weights
    [net, tr, y, e] = train(net, x, t);
    % y = net(x); e = t - y;
    view(net)
    NMSE(i) = mse(e)/vart1            % Normalized MSE
end
Rsq = 1 - NMSE
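(A practical addition to the loop above, my own sketch rather than part of the original answer: keep the network with the lowest NMSE as you go, so the best of the Ntrials candidates survives the loop. bestNet and bestNMSE are illustrative variable names I've introduced:)

```matlab
% Illustrative extension of the Ntrials loop: retain the best candidate.
% bestNet/bestNMSE are hypothetical helper variables, not from the answer.
bestNMSE = Inf;
for i = 1:Ntrials
    net = configure(net, x, t);          % New random initial weights
    [net, tr, y, e] = train(net, x, t);
    NMSE(i) = mse(e)/vart1;
    if NMSE(i) < bestNMSE                % Keep the best candidate so far
        bestNMSE = NMSE(i);
        bestNet = net;
    end
end
Rsq = 1 - NMSE                           % One value per trial
```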
4. If this is not successful, vary the number of hidden nodes in an outer loop:
rng('default')
j = 0
for h = Hmin:dH:Hmax
    j = j + 1;
    if h == 0
        net = fitnet([]);   % Linear model: no hidden layer
    else
        net = fitnet(h);
    end
    for i = 1:Ntrials
        net = configure(net, x, t);
        [net, tr, y, e] = train(net, x, t);
        % y = net(x); e = t - y;
        % view(net)  % Probably don't need this now
        Rsq(i, j) = 1 - mse(e)/vart1;
    end
end
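(Once the double loop finishes, the best design can be read off the Rsq matrix. A small sketch of my own, with illustrative variable names; it assumes the Hmin:dH:Hmax values are stored in a vector, here called Hvec, so the column index maps back to a node count:)

```matlab
% Locate the best (trial, hidden-node-count) combination in the Rsq matrix.
% Hvec is an illustrative name for the tried node counts: Hvec = Hmin:dH:Hmax.
[bestRsq, idx] = max(Rsq(:));
[ibest, jbest] = ind2sub(size(Rsq), idx);
bestH = Hvec(jbest)      % Hidden-node count of the best design
bestRsq                  % Its R^2; closer to 1 is better
```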
I have posted zillions of examples. Search both the NEWSGROUP and ANSWERS using
fitnet tutorial
fitnet greg
Hope this helps.
Thank you for formally accepting my answer
Greg