I am using logistic regression for a classification task. The task is equivalent to finding $\omega, b$ that minimize the loss function: $$ L(\omega, b) = \sum_{i=1}^{n} \ln\left(1 + e^{-y_i(\omega^T x_i + b)}\right) $$
That means we will take the derivative of $L$ with respect to $\omega$ and $b$ (assuming $y$ and $X$ are known). Could you help me develop that derivation? Thank you so much.
Best Answer
I will ignore the sum because of the linearity of differentiation [1], and I will ignore the bias because the derivation for $w$, which I show below, is sufficiently similar. For what it's worth, I think the key is to really understand the chain rule [2]. You might also find these differentiation rules helpful. Let's first compute the derivative of each function separately:
$$ l(a) = \ln(a) = z $$ $$ l^{\prime}(a) = \frac{\partial z}{\partial a} = \frac{1}{a} $$
$$ f(b) = 1 + e^b = v $$
$$ f^{\prime}(b) = \frac{\partial v}{\partial b} = e^b $$
$$ g(c) = -yc = u $$
$$ g^{\prime}(c) = \frac{\partial u}{\partial c} = -y $$
$$ h(w) = wx = t $$
$$ h^{\prime}(w) = \frac{\partial t}{\partial w} = x $$
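Each of these elementary derivatives can be sanity-checked numerically before composing them. A minimal sketch, assuming scalar $w$, $x$, $y$ with arbitrary illustrative values, comparing each claimed derivative against a central finite difference:

```python
import math

def num_deriv(fn, point, eps=1e-6):
    # Central finite-difference approximation of fn'(point)
    return (fn(point + eps) - fn(point - eps)) / (2 * eps)

y, x = 1.0, 2.0  # arbitrary sample values, chosen only for illustration

# The four pieces from the answer and their claimed derivatives
l = lambda a: math.log(a);      dl = lambda a: 1 / a
f = lambda b: 1 + math.exp(b);  df = lambda b: math.exp(b)
g = lambda c: -y * c;           dg = lambda c: -y
h = lambda w: w * x;            dh = lambda w: x

# Check each derivative at an arbitrary point in its domain
for fn, dfn, point in [(l, dl, 0.5), (f, df, 0.3), (g, dg, 1.2), (h, dh, 0.7)]:
    assert abs(num_deriv(fn, point) - dfn(point)) < 1e-6
```

If any of the four rules were wrong, the corresponding assertion would fail.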
Composing these functions:
$$ l(f(g(h(w)))) = \ln(1 + e^{-y(wx)}) $$
$$ \frac{\partial}{\partial w}\, l(f(g(h(w)))) = \frac{\partial z}{\partial v} \frac{\partial v}{\partial u} \frac{\partial u}{\partial t} \frac{\partial t}{\partial w} = \frac{1}{1+e^{-y(wx)}} \times e^{-y(wx)} \times (-y) \times x = \frac{-yxe^{-y(wx)}}{1+e^{-y(wx)}} $$
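The composed result can be verified the same way: compare the closed-form gradient above against a finite difference of the loss itself. A quick sketch, again with arbitrary scalar values:

```python
import math

def loss(w, x, y):
    # Single-sample logistic loss: ln(1 + e^{-y(wx)})
    return math.log(1 + math.exp(-y * w * x))

def grad(w, x, y):
    # The closed form derived above: -y x e^{-y(wx)} / (1 + e^{-y(wx)})
    e = math.exp(-y * w * x)
    return -y * x * e / (1 + e)

# Finite-difference check (the sample values here are arbitrary)
w, x, y, eps = 0.4, 1.5, -1.0, 1e-6
numeric = (loss(w + eps, x, y) - loss(w - eps, x, y)) / (2 * eps)
assert abs(numeric - grad(w, x, y)) < 1e-6
```

For a full dataset, linearity of differentiation means you simply sum this per-sample gradient over all $(x_i, y_i)$, which is exactly why the sum could be ignored at the start.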