I am using tanh as the activation function for my NN. Previously, with sigmoid neurons, I used the cross-entropy cost function. A sigmoid output can never reach exactly zero, but a tanh output can, so when I train the NN I get division-by-zero errors. I switched back to the quadratic cost function, but it converges slowly. Is there a way to use the cross-entropy cost with tanh, or is there something better I could use?
Solved – Division by zero with cross entropy cost function
neural networks
Best Answer
It's common to use softmax as the final layer: it converts the raw output values into probabilities, which is exactly what the cross-entropy cost expects. If you use softmax as the activation for the final layer, you can use any activation you like, including tanh, for the previous layers.
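Here is a minimal sketch of that idea in plain NumPy: a one-hidden-layer network with tanh in the hidden layer and softmax at the output, trained with cross-entropy. The data, layer sizes, and learning rate are illustrative assumptions, not taken from the question. Because softmax outputs are strictly between 0 and 1, the log in the cross-entropy never sees zero, regardless of what the hidden activations do.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)      # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(p, y):
    # y is one-hot; p comes from softmax, so every entry is strictly positive
    return -np.mean(np.sum(y * np.log(p), axis=1))

# toy data: 100 samples, 4 features, 3 classes (illustrative only)
X = rng.normal(size=(100, 4))
y = np.eye(3)[rng.integers(0, 3, size=100)]

W1 = rng.normal(scale=0.1, size=(4, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.1, size=(8, 3)); b2 = np.zeros(3)
lr = 0.5

for epoch in range(200):
    # forward pass: tanh hidden layer, softmax output layer
    h = np.tanh(X @ W1 + b1)
    p = softmax(h @ W2 + b2)

    # backward pass: softmax + cross-entropy gives the simple gradient (p - y)
    dz2 = (p - y) / len(X)
    dW2 = h.T @ dz2; db2 = dz2.sum(axis=0)
    dh = dz2 @ W2.T
    dz1 = dh * (1 - h ** 2)                   # derivative of tanh
    dW1 = X.T @ dz1; db1 = dz1.sum(axis=0)

    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("final loss:", cross_entropy(softmax(np.tanh(X @ W1 + b1) @ W2 + b2), y))
```

Note that the softmax-plus-cross-entropy pairing is also why the output-layer gradient collapses to the simple form (p - y): the troublesome log and the exponential cancel, so training stays stable even when some predicted probabilities are very small.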