Solved – Neural network produces negative outputs

MATLAB, neural networks, prediction

I am using a simple feedforward neural network in MATLAB to predict outputs for inputs in the range [1e-5, 0.3] (these are the activations of another network). The hidden layer uses a sigmoid transfer function and the output layer a linear one. The network has 6 input units, 4 hidden units, and a single output neuron.
The targets range over [58, 1696]. I normalized the targets myself and turned off the mapminmax processing function so they would not be normalized twice.
Strangely, the network produces negative outputs. Could this be because of the input range?
I would appreciate it if anyone could tell me what is happening here. Any thoughts?

Update:
I changed the number of hidden neurons to 10. Now the network sometimes gives negative outputs and sometimes positive ones. I cannot explain this other than by the random initialization of the weights.
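One quick way to test the random-initialization hypothesis is to fix MATLAB's random seed before training, so every run starts from the same initial weights (a minimal sketch, using the same variable names as the code below):

rng(0);                        % fix the seed: identical initial weights each run
net = feedforwardnet(10);
NN  = train(net, features, y); % init happens inside train, after the seed is set
% If the sign of the outputs is now stable across runs, the flip-flopping
% really did come from the random weight initialization.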

Here is an excerpt of my code:

clear all;
load features; load labeldata;

% Normalize targets to [0,1]; y is what the network is trained on.
y = (labeldata - min(labeldata)) / (max(labeldata) - min(labeldata));

net = feedforwardnet(10);
% Turn off mapminmax by keeping only these processing functions.
net.inputs{1}.processFcns  = {'fixunknowns','removeconstantrows'};
net.outputs{2}.processFcns = {'removeconstantrows'};
net.trainParam.lr = 0.01;
net.trainParam.max_fail = 10;
NN = train(net, features, y);
wb = getwb(NN);   % weights and biases of the trained net
net = NN;

%% TEST THE NET (NOT WITH NEW TEST DATA BUT WITH THE TRAINING DATA, SO WE EXPECT GOOD RESULTS)
pred_learnedFeatures = net(features);
% Undo the normalization to compare against the raw targets.
scaled_out = pred_learnedFeatures*(max(labeldata) - min(labeldata)) + min(labeldata);
fprintf('MSE w/o scaling: %f\n', sum((pred_learnedFeatures - y).^2)/size(labeldata,2));
fprintf('MSE w scaling:   %f\n', sum((scaled_out - labeldata).^2)/size(labeldata,2));

Best Answer

One reason that this could happen is that the network hasn't converged.

Another reason could be that this particular network is not the best architecture to solve this problem.
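To check the first possibility, inspect the training record that train returns (a minimal sketch, reusing the variables from the question):

[NN, tr] = train(net, features, y);   % tr is the training record
plotperform(tr)                       % did the error curve actually flatten out?
fprintf('best training MSE %g at epoch %d\n', tr.best_perf, tr.best_epoch);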

But a direct solution would be to use an activation function in the final layer that is restricted to [0,1], such as the logistic function; this can never produce negative outputs. Moreover, since you've scaled your targets to [0,1] as well, the predictions will always lie in the same interval as the targets.
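In MATLAB this is a one-line change on the network object (a minimal sketch, assuming the same features, y, and labeldata variables as in the question):

net = feedforwardnet(10);
net.layers{2}.transferFcn = 'logsig';   % logistic output layer: values in (0,1)
NN = train(net, features, y);           % y is the normalized target in [0,1]
pred = NN(features);                    % can never be negative
% Undo the normalization to get predictions back on the original scale.
scaled_out = pred*(max(labeldata) - min(labeldata)) + min(labeldata);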