Solved – ln(0) when using the cross-entropy cost function to train a neural network

cross-entropy, neural-networks

In Nielsen's online book, the cross-entropy cost function is given as follows:

$$ C = -\frac{1}{n} \sum_x \left[ y \ln a + (1-y)\ln(1-a) \right] $$

When $a$ equals 1, the last logarithm becomes $\ln(0)$, which is undefined (taking the limit from the right, it diverges to $-\infty$). The cost would then be undefined or infinite. (The first term has the same problem when $a = 0$.)
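For concreteness, here is a small numerical illustration (my own sketch, not from the book) of what happens if the formula is evaluated naively with an activation that is exactly 1:

```python
import numpy as np

# Hypothetical activations and targets; the first activation is exactly 1.
a = np.array([1.0, 0.5])
y = np.array([0.0, 1.0])

# Naive cross-entropy, straight from the formula above.
cost = -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))
print(cost)  # inf (with a divide-by-zero warning): ln(1 - 1) = ln(0) = -inf
```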

Isn't this a problem? How is this dealt with? Do we just assume $a$ will never be exactly 1?

Best Answer

OK, I figured out the answer myself. The output $a$ is produced by a sigmoid function, e.g. the logistic function. Such a function only reaches 1 in the limit as its input goes to infinity, so for any finite input $a$ never equals exactly 1 and the logarithm stays defined.
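A minimal sketch of that reasoning (my own illustration, not Nielsen's code): the logistic output lies strictly between 0 and 1 for any finite weighted input, so both logarithms stay finite. As an extra safeguard, implementations commonly clip $a$ away from exactly 0 and 1, since floating-point rounding can push it there for very large inputs; the `eps` below is that assumed safeguard, not something stated in the original answer.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: strictly inside (0, 1) for any finite z."""
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(y, a, eps=1e-12):
    """Cross-entropy cost; clipping guards against a rounding to exactly 0 or 1."""
    a = np.clip(a, eps, 1.0 - eps)
    return -np.mean(y * np.log(a) + (1.0 - y) * np.log(1.0 - a))

z = np.array([-5.0, 0.0, 2.0, 5.0])   # weighted inputs to the output neuron
a = sigmoid(z)                        # roughly [0.0067, 0.5, 0.881, 0.9933]
y = np.array([0.0, 1.0, 1.0, 1.0])    # targets

print(a)                    # every value is strictly between 0 and 1
print(cross_entropy(y, a))  # finite cost, no ln(0)
```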