Solved – Neural network failing because of Infinity


I use several different activation functions in my network, and I have noticed my networks failing (producing NaN). The chain of events is:

  • I have large layers with ordinary initial weights, so some neurons receive large values as input
  • The Softplus activation function outputs Infinity for inputs above ~800
  • The Sinusoid/Softsign/Bent identity activations then produce NaN when given (-)Infinity as input

How can I stop this from happening? Should I put limits on the inputs (e.g. Min(input, 10e15))?
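
Rather than clamping the raw inputs, one option is to compute the overflowing activation in a numerically safe form. Below is a minimal sketch (plain NumPy, function names are my own) of a softplus that stays finite for inputs around 800, by rewriting it so the exponent is never positive:

```python
import numpy as np

def softplus_naive(x):
    # Overflows: np.exp(800.0) is inf, so the result is inf.
    return np.log(1.0 + np.exp(x))

def softplus_stable(x):
    # Rewrite softplus(x) = max(x, 0) + log1p(exp(-|x|));
    # the exponent is always <= 0, so exp() never overflows.
    return np.maximum(x, 0.0) + np.log1p(np.exp(-np.abs(x)))

print(softplus_naive(np.array([800.0])))   # [inf]
print(softplus_stable(np.array([800.0])))  # [800.] -- finite
```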

Also, the derivative of the complementary log-log activation function returns Infinity rather quickly (even though the output of the function itself is bounded), causing the backpropagation algorithm to fail. I worked around this by returning 0 when x > 800.
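
If the activation is the usual complementary log-log, f(x) = 1 - exp(-exp(x)), its derivative can be computed as a single exponential, exp(x - exp(x)), which underflows cleanly to 0 for large x instead of producing Infinity/NaN, so the hard cutoff at x > 800 is not needed. A small sketch of that idea (helper names are my own):

```python
import numpy as np

def cloglog_grad_naive(x):
    # f(x) = 1 - exp(-exp(x));  f'(x) = exp(x) * exp(-exp(x))
    # For large x, exp(x) overflows to inf while exp(-exp(x)) underflows to 0,
    # giving inf * 0 = nan.
    return np.exp(x) * np.exp(-np.exp(x))

def cloglog_grad_stable(x):
    # Same derivative written as a single exponent: exp(x - exp(x)).
    # For large x the exponent goes to -inf, so the result cleanly underflows to 0.
    return np.exp(x - np.exp(x))

print(cloglog_grad_naive(np.array([1.0, 800.0])))   # [0.179...   nan]
print(cloglog_grad_stable(np.array([1.0, 800.0])))  # [0.179...   0. ]
```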

And last of all, I have nodes with the Absolute activation function. When these nodes are self-connected, their activations keep growing without bound, so after a while they output Infinity as well.
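
For what it's worth, a self-connected node with the absolute-value activation multiplies its magnitude by |w| at every step, so it diverges exactly when the recurrent weight satisfies |w| > 1; keeping that weight below 1 in magnitude (or clipping the activation) keeps it bounded. A toy illustration (hypothetical setup, not your actual network):

```python
def run_self_loop(w, x0=1.0, steps=50):
    # A node with the absolute-value activation feeding back into itself:
    # x_{t+1} = |w * x_t| = |w| * x_t, so the activation grows like |w|**t.
    x = x0
    for _ in range(steps):
        x = abs(w * x)
    return x

print(run_self_loop(w=1.5))   # ~1.5**50: keeps growing, eventually overflows to inf
print(run_self_loop(w=0.9))   # ~0.9**50: decays and stays bounded
```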

Best Answer

There are several things that can help here. Try the following and report back:

Normalize the input : this is practically a standard step with neural networks, above all if you are using sinusoid activation functions.
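
For example, a plain per-feature standardization (zero mean, unit variance) looks roughly like this; the data here is made up, and the key point is to reuse the training-set statistics at prediction time:

```python
import numpy as np

# Hypothetical raw training inputs with very different scales per feature.
X = np.array([[1e6, 0.001],
              [2e6, 0.004],
              [3e6, 0.002]])

# Standardize each feature using statistics from the training set only.
mean = X.mean(axis=0)
std = X.std(axis=0)
X_norm = (X - mean) / std

print(X_norm.mean(axis=0))  # ~[0, 0]
print(X_norm.std(axis=0))   # ~[1, 1]

# At prediction time, reuse the *training* mean/std on new inputs:
x_new = np.array([[1.5e6, 0.003]])
x_new_norm = (x_new - mean) / std
```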

Apply transformations to the input : this of course depends strongly on your data and how it is distributed, but you may, for example, apply a log to the input data. I suspect this could help quite a lot in your case! Remember that you can apply basically any transformation to the input as long as you apply it consistently every time. In general, bijective functions are preferred (compared to, say, max()), since no information is lost.
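
A sketch of the log idea, assuming a nonnegative, heavy-tailed feature (log1p and expm1 form a bijective pair, so the transformation is reversible):

```python
import numpy as np

# Hypothetical heavy-tailed, nonnegative feature (e.g. counts or sizes).
x = np.array([1.0, 10.0, 1_000.0, 1_000_000.0])

# log1p is bijective on [0, inf), so it can be inverted with expm1
# and no information is lost -- only the scale is compressed.
x_log = np.log1p(x)
print(x_log)            # [ 0.69  2.40  6.91 13.82] (approx.)
print(np.expm1(x_log))  # recovers the original values (up to float error)
```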

Change your solver, change your cost-function : there are many solvers and cost functions available. Try them! For example, Adam.
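
As an illustration only (the question does not say which framework is used), in Keras the solver and the cost function are both chosen in compile(), so swapping in Adam or a different loss is a one-line change:

```python
# Sketch assuming a Keras model; layer sizes are placeholders.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Swap the solver and the cost function here: Adam instead of plain SGD,
# and e.g. mean squared error as the loss.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="mse")
```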

Maybe change your activation functions : ReLU usually performs quite well, and Leaky ReLU (or similar ReLU variants) can be an option.
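
For comparison, ReLU and Leaky ReLU are piecewise linear, so they do not overflow at inputs around 800 the way exponential-based activations do. A minimal NumPy sketch (function names are my own):

```python
import numpy as np

def relu(x):
    # Piecewise linear: no exponentials, so no overflow at large inputs.
    return np.maximum(x, 0.0)

def leaky_relu(x, alpha=0.01):
    # Like ReLU but with a small slope for negative inputs,
    # which keeps gradients from dying on the negative side.
    return np.where(x > 0.0, x, alpha * x)

x = np.array([-800.0, 0.0, 800.0])
print(relu(x))        # [  0.   0. 800.]
print(leaky_relu(x))  # [ -8.   0. 800.]
```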