Loss Functions – How to Calculate the Derivative of the Cross-Entropy Loss Function?

I have a cross entropy loss function.

$$
L = -{1 \over N} \sum_i \left[ y_i \cdot \log \left( {1 \over {1+e^{-\vec x_i \cdot \vec w}}} \right) + (1-y_i) \cdot \log \left( 1-{1 \over {1+e^{-\vec x_i \cdot \vec w}}} \right) \right]
$$
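
For concreteness, here is a minimal NumPy sketch of this loss. It assumes the samples are stacked as rows of a matrix `X` with labels `y` in {0, 1}; the function name and the toy data are illustrative, not part of the original question. The two logarithms are evaluated through `np.logaddexp` so that large values of $\vec x \cdot \vec w$ do not overflow.

```python
import numpy as np

def cross_entropy_loss(w, X, y):
    """Mean binary cross-entropy for a logistic model.

    X: (N, n) feature matrix, y: (N,) labels in {0, 1}, w: (n,) weights.
    """
    z = X @ w  # x_i . w for every sample
    # log(sigmoid(z)) = -log(1 + e^{-z}) and log(1 - sigmoid(z)) = -log(1 + e^{z});
    # np.logaddexp(0, t) evaluates log(1 + e^t) without overflow.
    log_p = -np.logaddexp(0.0, -z)
    log_one_minus_p = -np.logaddexp(0.0, z)
    return -np.mean(y * log_p + (1.0 - y) * log_one_minus_p)

# Toy data (hypothetical). With w = 0 every prediction is 0.5,
# so the loss equals log(2) whatever the labels are.
X = np.array([[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]])
y = np.array([1.0, 0.0, 1.0])
print(cross_entropy_loss(np.zeros(2), X, y))
```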

I want to calculate its gradient with respect to the weights, i.e. $\nabla L = {\partial L \over \partial \vec w}$.

How do I do that?

Best Answer

$$\nabla L = \begin{pmatrix} \frac{\partial L}{\partial w_1} \\ \frac{\partial L}{\partial w_2} \\ \vdots \\ \frac{\partial L}{\partial w_n} \end{pmatrix}$$

This requires computing the derivatives of terms like

$$\log {1 \over {1+e^{-\vec x \cdot \vec w}}} = \log {1 \over {1+e^{-(x_1 \cdot w_1 + x_2 \cdot w_2 + \, \dots \, + x_n \cdot w_n)}}}$$

where you can use

$$\frac{\partial}{\partial x} \left( \log\frac{1}{1+e^{-(a+bx)}} \right) = \frac{b}{1+e^{(a+bx)}}$$

and

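$$\frac{\partial}{\partial x} \left( \log\left(1-\frac{1}{1+e^{-(a+bx)}}\right) \right) = -\frac{b}{1+e^{-(a+bx)}}$$

In these identities the variable $x$ plays the role of a single weight $w_j$, and $b$ plays the role of $x_{ij}$, the $j$-th component of the feature vector $\vec x_i$ of sample $i$.

Filling that in, and keeping the factor $-\frac{1}{N}$ and the sum over samples from the loss, you get

$$\frac{\partial L}{\partial w_j} = -\frac{1}{N} \sum_i \left( \frac{y_i \, x_{ij}}{1+e^{\vec x_i \cdot \vec w}} - \frac{(1-y_i) \, x_{ij}}{1+e^{-\vec x_i \cdot \vec w}} \right)$$

and

$$\nabla L = -\frac{1}{N} \sum_i \left( \frac{y_i}{1+e^{\vec x_i \cdot \vec w}} - \frac{1-y_i}{1+e^{-\vec x_i \cdot \vec w}} \right) \vec x_i$$

Writing $\sigma_i = \frac{1}{1+e^{-\vec x_i \cdot \vec w}}$, so that $\frac{1}{1+e^{\vec x_i \cdot \vec w}} = 1 - \sigma_i$, the bracket becomes $y_i (1-\sigma_i) - (1-y_i) \sigma_i = y_i - \sigma_i$, and the gradient simplifies to

$$\nabla L = \frac{1}{N} \sum_i \left( \sigma_i - y_i \right) \vec x_i$$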
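
One way to sanity-check a derivation like this is to compare an analytic gradient against central finite differences of the loss. The sketch below does that in NumPy. It uses the standard compact form of the logistic-regression gradient, $\frac{1}{N} \sum_i (\sigma(\vec x_i \cdot \vec w) - y_i) \, \vec x_i$ with $\sigma(z) = 1/(1+e^{-z})$, and assumes the samples are stacked as rows of `X`. The function names, the random data, and the step size `eps` are illustrative choices, not part of the original answer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, X, y):
    # Mean cross-entropy; logaddexp(0, t) = log(1 + e^t), evaluated stably.
    z = X @ w
    return np.mean(y * np.logaddexp(0.0, -z) + (1.0 - y) * np.logaddexp(0.0, z))

def grad(w, X, y):
    # Analytic gradient: (1/N) * sum_i (sigmoid(x_i . w) - y_i) * x_i
    return X.T @ (sigmoid(X @ w) - y) / len(y)

def numerical_grad(f, w, eps=1e-6):
    # Central finite differences, one coordinate of w at a time.
    g = np.zeros_like(w)
    for j in range(w.size):
        step = np.zeros_like(w)
        step[j] = eps
        g[j] = (f(w + step) - f(w - step)) / (2.0 * eps)
    return g

# Random test problem (hypothetical data, fixed seed for repeatability).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = (rng.random(50) < 0.5).astype(float)
w = rng.normal(size=3)

analytic = grad(w, X, y)
numeric = numerical_grad(lambda v: loss(v, X, y), w)
print("max abs difference:", np.max(np.abs(analytic - numeric)))
```

If the two gradients agree to within the finite-difference error, the signs and denominators in the analytic expression are very likely right; a sign error or a wrong denominator shows up as a large mismatch.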