Solved – Cross entropy loss function and division by zero

entropy, neural networks

I'm trying out the cross entropy loss function for neural network training, following the arguments for why it's better than mean squared error.

However, I'm getting division-by-zero errors that lead to infinite weights. Looking at the formula, e.g. as implemented in the tiny-cnn library,

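class cross_entropy_multiclass {
public:
    // Loss for one output: y is the network output, t is the target.
    static float_t f(float_t y, float_t t) {
        return -t * std::log(y);
    }

    // Derivative of the loss with respect to y.
    static float_t df(float_t y, float_t t) {
        return -t / y;
    }
};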
in one sense this is not surprising, as it will give division by zero every time the current output of a neuron, y, happens to be zero.

In another sense it is surprising; if this were a known problem with cross entropy loss, I would expect it to be mentioned in some of the discussion I looked at.
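To make the failure mode concrete, here is a minimal sketch that isolates the derivative from the excerpt above. It assumes IEEE 754 floating point, where dividing by zero yields an infinity or NaN rather than trapping, and it uses plain `float` in place of the library's `float_t`. The names `cross_entropy_df` and `gradient_is_usable` are hypothetical, chosen only for this illustration.

```cpp
#include <cmath>

// Derivative of -t * log(y) with respect to y, as in the excerpt.
// Hypothetical free function; float stands in for float_t (assumption).
inline float cross_entropy_df(float y, float t) {
    return -t / y;
}

// True if the gradient for this (y, t) pair is a finite number,
// i.e. it can still be used in a weight update.
inline bool gradient_is_usable(float y, float t) {
    return std::isfinite(cross_entropy_df(y, t));
}
```

Under those assumptions, `y = 0` with `t = 1` gives negative infinity and `y = 0` with `t = 0` gives NaN; either value ends up in the weights after a single update.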

Am I doing something wrong, or is there some sort of bug in tiny-cnn, or what else am I missing?

Best Answer

Your output neuron should use a sigmoid activation function, which keeps its output strictly between 0 and 1, so mathematically y never reaches zero.
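A minimal sketch of why this helps, assuming a single sigmoid output unit and the loss -t * log(y) from the question. Plain `float` stands in for `float_t`, and `sigmoid`, `delta_sigmoid`, and `df_clamped` are hypothetical helper names, not tiny-cnn API. Writing the chain rule out analytically cancels the 1/y term, so the gradient with respect to the pre-activation stays finite. The clamp at the end is a separate, commonly used guard, added here as an assumption for the case where the sigmoid underflows to exactly zero in floating point.

```cpp
#include <algorithm>
#include <cmath>

// Logistic sigmoid: maps a real pre-activation a into (0, 1).
// In float arithmetic it can still round to exactly 0 for very negative a.
inline float sigmoid(float a) {
    return 1.0f / (1.0f + std::exp(-a));
}

// Gradient of E = -t * log(y) with respect to the pre-activation a,
// where y = sigmoid(a). Chain rule, simplified by hand:
//   dE/da = (-t / y) * y * (1 - y) = -t * (1 - y)
// No division remains, so y == 0 no longer causes a blow-up.
inline float delta_sigmoid(float a, float t) {
    const float y = sigmoid(a);
    return -t * (1.0f - y);
}

// Optional guard (an assumption, not part of the answer above):
// keep y at least eps before dividing, if dE/dy is needed on its own.
inline float df_clamped(float y, float t, float eps = 1e-7f) {
    return -t / std::max(y, eps);
}
```

For example, `delta_sigmoid(0.0f, 1.0f)` evaluates to -0.5, and the result stays finite even for a strongly negative pre-activation where `sigmoid` rounds down to zero.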