In Andrew Ng's Neural Networks and Deep Learning course on Coursera the logistic regression loss function for a single training example is given as:
$$ \mathcal L(a,y) = -\Big(y\log a + (1 - y)\log (1 - a)\Big) $$
Where $a$ is the activation of the neuron.
The following slide gives the partial derivatives, including:
$$\frac{\partial \mathcal L(a, y)}{\partial a} = -\frac{y}{a} + \frac{1-y}{1-a}$$
Why isn't the second term negative, i.e. $-\frac{1-y}{1-a}$, given the minus sign outside the brackets in the definition of $\mathcal L(a, y)$?
Best Answer
Using the Derivative Calculator, I found that I wasn't applying the chain rule.
$$\begin{align}\frac{\mathrm{d}}{\mathrm{d}a} \log(1-a) &= \frac{1}{1-a}\cdot \frac{\mathrm{d}}{\mathrm{d}a}\left[1-a\right] \\ &= \frac{1}{1-a}\cdot(-1) \\ &= -\frac{1}{1-a} \end{align}$$

So the derivative of $(1-y)\log(1-a)$ is $-\frac{1-y}{1-a}$, and the minus sign outside the brackets in $\mathcal L(a,y)$ flips it to $+\frac{1-y}{1-a}$, which is why the second term ends up positive.
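As a sanity check, the analytic derivative can be compared against a central finite difference of the loss. This is a minimal sketch (the function names `loss` and `dloss_da` are my own, not from the course):

```python
import math

def loss(a, y):
    # Logistic loss for a single example: L(a, y) = -(y*log(a) + (1-y)*log(1-a))
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))

def dloss_da(a, y):
    # Analytic partial derivative from the slide: -y/a + (1-y)/(1-a)
    return -y / a + (1 - y) / (1 - a)

# Central-difference approximation at an arbitrary point
a, y, h = 0.3, 1.0, 1e-6
numeric = (loss(a + h, y) - loss(a - h, y)) / (2 * h)
print(abs(numeric - dloss_da(a, y)) < 1e-6)  # prints True: the signs agree
```

Trying `y = 0.0` as well confirms the positive sign on the $\frac{1-y}{1-a}$ term: the numerical slope matches $+\frac{1}{1-a}$, not $-\frac{1}{1-a}$.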