How can we make sure our deep neural network is differentiable?

calculus, gradient descent

When we have a deep neural network, no matter how complicated it is, how can we make sure that in each layer we can calculate the derivatives (i.e., that the network is differentiable)? We may have a combination of multiplications, summations, activation functions, and so on, but how do we know the result is differentiable?

Best Answer

The "combination" of differentiable functions is usually again differentiable, so it is often easy to assert differentiability. But activation functions like ReLU are not differentiable, which means that all ReLU DNNs have losses that are, strictly speaking, not differentiable w.r.t. the weights.

However, ReLU is differentiable everywhere except at a single point ($x = 0$), so we simply assume that during training we never land exactly on the points where the loss is not differentiable. In practice, if this unlikely event does happen and we have to take the derivative of ReLU at $x = 0$, it is handled by simply setting the derivative to zero (pretending $x$ is a tiny bit less than zero).
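A minimal sketch of this convention in practice, using PyTorch as one example framework (the specific library choice is mine, not part of the question):

```python
import torch

# ReLU is not differentiable at x = 0 in the strict sense; autograd
# frameworks pick a value for the "derivative" there anyway.
x = torch.tensor([-1.0, 0.0, 1.0], requires_grad=True)
y = torch.relu(x).sum()
y.backward()

# PyTorch uses the convention d/dx ReLU(0) = 0, i.e. it behaves as if
# x were a tiny bit below zero at the kink.
print(x.grad)  # tensor([0., 0., 1.])
```

So even though the loss is not differentiable in the textbook sense, the framework always returns *some* gradient, and the chosen value at the kink has no practical effect on training.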
