Solved – Neural network failing because of Infinity


I use several different activation functions in my network, and I have noticed my networks failing (producing NaN). The chain of events is:

  • I have large layers with ordinary initial weights, so some neurons receive large values as input
  • The Softplus activation function outputs Infinity for inputs above ~800
  • The Sinusoid/Softsign/Bent identity activations then produce NaN when given (-)Infinity as input

How can I stop this from happening? Should I put limits on the inputs (e.g. Min(input, 10e15))?
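
Rather than clamping the raw inputs, one option is to compute the overflowing activation in a numerically safe form. Below is a minimal sketch (plain NumPy, function names are my own) of a softplus that stays finite for inputs around 800, by rewriting it so the exponent is never positive:

```python
import numpy as np

def softplus_naive(x):
    # Overflows: np.exp(800.0) is inf, so the result is inf.
    return np.log(1.0 + np.exp(x))

def softplus_stable(x):
    # Rewrite softplus(x) = max(x, 0) + log1p(exp(-|x|));
    # the exponent is always <= 0, so exp() never overflows.
    return np.maximum(x, 0.0) + np.log1p(np.exp(-np.abs(x)))

print(softplus_naive(np.array([800.0])))   # [inf]
print(softplus_stable(np.array([800.0])))  # [800.] -- finite
```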

Also, the derivative of the complementary log-log activation function returns Infinity rather quickly (even though the output of the function itself is bounded), causing the backpropagation algorithm to fail. I worked around this by returning 0 when x > 800.
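
If the activation is the usual complementary log-log, f(x) = 1 - exp(-exp(x)), its derivative can be computed as a single exponential, exp(x - exp(x)), which underflows cleanly to 0 for large x instead of producing Infinity/NaN, so the hard cutoff at x > 800 is not needed. A small sketch of that idea (helper names are my own):

```python
import numpy as np

def cloglog_grad_naive(x):
    # f(x) = 1 - exp(-exp(x));  f'(x) = exp(x) * exp(-exp(x))
    # For large x, exp(x) overflows to inf while exp(-exp(x)) underflows to 0,
    # giving inf * 0 = nan.
    return np.exp(x) * np.exp(-np.exp(x))

def cloglog_grad_stable(x):
    # Same derivative written as a single exponent: exp(x - exp(x)).
    # For large x the exponent goes to -inf, so the result cleanly underflows to 0.
    return np.exp(x - np.exp(x))

print(cloglog_grad_naive(np.array([1.0, 800.0])))   # [0.179...   nan]
print(cloglog_grad_stable(np.array([1.0, 800.0])))  # [0.179...   0. ]
```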

And last of all, I have nodes with the Absolute activation function. When these nodes are self-connected, their activations keep growing without bound, so after a while they output Infinity as well.
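
For what it's worth, a self-connected node with the absolute-value activation multiplies its magnitude by |w| at every step, so it diverges exactly when the recurrent weight satisfies |w| > 1; keeping that weight below 1 in magnitude (or clipping the activation) keeps it bounded. A toy illustration (hypothetical setup, not your actual network):

```python
def run_self_loop(w, x0=1.0, steps=50):
    # A node with the absolute-value activation feeding back into itself:
    # x_{t+1} = |w * x_t| = |w| * x_t, so the activation grows like |w|**t.
    x = x0
    for _ in range(steps):
        x = abs(w * x)
    return x

print(run_self_loop(w=1.5))   # ~1.5**50: keeps growing, eventually overflows to inf
print(run_self_loop(w=0.9))   # ~0.9**50: decays and stays bounded
```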

Best Answer

There are several things that can help here. Try the following and report back:

Normalize the input : this is practically a standard step with neural networks, above all if you are using sinusoid activation functions.
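
For example, a plain per-feature standardization (zero mean, unit variance) looks roughly like this; the data here is made up, and the key point is to reuse the training-set statistics at prediction time:

```python
import numpy as np

# Hypothetical raw training inputs with very different scales per feature.
X = np.array([[1e6, 0.001],
              [2e6, 0.004],
              [3e6, 0.002]])

# Standardize each feature using statistics from the training set only.
mean = X.mean(axis=0)
std = X.std(axis=0)
X_norm = (X - mean) / std

print(X_norm.mean(axis=0))  # ~[0, 0]
print(X_norm.std(axis=0))   # ~[1, 1]

# At prediction time, reuse the *training* mean/std on new inputs:
x_new = np.array([[1.5e6, 0.003]])
x_new_norm = (x_new - mean) / std
```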

Apply transformations to the input : this of course depends strongly on your data and how it is distributed, but you may, for example, apply a log to the input data. I suspect this could help quite a lot in your case! Remember that you can apply basically any transformation to the input as long as you apply it consistently every time. In general, bijective functions are preferred (compared to, say, max()), since no information is lost.
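
A sketch of the log idea, assuming a nonnegative, heavy-tailed feature (log1p and expm1 form a bijective pair, so the transformation is reversible):

```python
import numpy as np

# Hypothetical heavy-tailed, nonnegative feature (e.g. counts or sizes).
x = np.array([1.0, 10.0, 1_000.0, 1_000_000.0])

# log1p is bijective on [0, inf), so it can be inverted with expm1
# and no information is lost -- only the scale is compressed.
x_log = np.log1p(x)
print(x_log)            # [ 0.69  2.40  6.91 13.82] (approx.)
print(np.expm1(x_log))  # recovers the original values (up to float error)
```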

Change your solver, change your cost-function : there are many solvers and cost functions available. Try them! For example, Adam.
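
As an illustration only (the question does not say which framework is used), in Keras the solver and the cost function are both chosen in compile(), so swapping in Adam or a different loss is a one-line change:

```python
# Sketch assuming a Keras model; layer sizes are placeholders.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Swap the solver and the cost function here: Adam instead of plain SGD,
# and e.g. mean squared error as the loss.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="mse")
```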

Maybe change your activation functions : ReLU usually performs quite well, and Leaky ReLU (or similar ReLU variants) can be an option.
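
For comparison, ReLU and Leaky ReLU are piecewise linear, so they do not overflow at inputs around 800 the way exponential-based activations do. A minimal NumPy sketch (function names are my own):

```python
import numpy as np

def relu(x):
    # Piecewise linear: no exponentials, so no overflow at large inputs.
    return np.maximum(x, 0.0)

def leaky_relu(x, alpha=0.01):
    # Like ReLU but with a small slope for negative inputs,
    # which keeps gradients from dying on the negative side.
    return np.where(x > 0.0, x, alpha * x)

x = np.array([-800.0, 0.0, 800.0])
print(relu(x))        # [  0.   0. 800.]
print(leaky_relu(x))  # [ -8.   0. 800.]
```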