[Math] How to calculate the partial derivative of the logistic sigmoid function

calculus, linear algebra, machine learning, partial derivative, statistics

This question is based on: derivative of cost function for Logistic Regression

I'm still having trouble understanding how this derivative is calculated:
$$\frac{\partial}{\partial \theta_j}\log(1+e^{\theta x^i})=\frac{x^i_je^{\theta x^i}}{1+e^{\theta x^i}}$$

My understanding is that we can apply the chain rule and substitute u like:
$$ u = {1+e^{\theta x^i}}$$
so that:
$$ y = \log(u)$$
and then calculate the derivatives like:
$$ \frac{du}{d\theta_j} = e^{x_j^i} $$
and
$$ \frac{dy}{du} = \frac{1}{u\ln(10)} $$
where $\frac{dy}{du}$ is calculated according to this formula:

$$\frac{d}{dx}\log_bx=\frac{1}{x\ln b}$$

Then by applying the chain rule, I would get:
$$ \frac{dy}{du} \cdot \frac{du}{d\theta_j} = \frac{dy}{d\theta_j} = \frac{e^{x_j^i}}{u\ln(10)} = \frac{e^{x_j^i}}{(1+e^{\theta x^i})\ln(10)} $$

But in the comments in the selected answer from the link above, they get:

$$\frac{\partial}{\partial \theta_j}\log(1+e^{\theta x^i}) = \frac{x^i_j}{e^{-\theta x^i}(1+e^{\theta x^i})}$$

So, am I making a mistake in my calculation? Or is there something I'm missing here?

Best Answer

Here are my calculations:

Let's say that $x\in\mathbb{R}^n$ and $\theta\in\mathbb{R}^n$, and write $\theta x' = \sum_k \theta_k x_k$. Then, by the chain rule,

$$\frac{\partial}{\partial\theta_j}\log (1+e^{\theta x'}) = \frac{1}{1+e^{\theta x'}}\frac{\partial}{\partial\theta_j}(1+e^{\theta x'}).$$

The derivative of the constant term is zero, and the derivative of the second term, again by the chain rule, is

$$\frac{\partial}{\partial\theta_j}(e^{\theta x'}) = e^{\theta x'}\frac{\partial}{\partial\theta_j}(\theta x') = e^{\theta x'}x_j.$$

Therefore the solution is

$$\frac{\partial}{\partial\theta_j}\log (1+e^{\theta x'}) = \frac{e^{\theta x'}x_j}{1+e^{\theta x'}}.$$
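The result can be sanity-checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes $\log$ is the natural logarithm, and the function names and the sample values of `theta` and `x` are made up for illustration. It compares the closed-form partial derivative with a central finite difference.

```python
import math


def f(theta, x):
    """log(1 + exp(theta . x)), using the natural logarithm (an assumption)."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return math.log1p(math.exp(z))


def analytic_partial(theta, x, j):
    """Closed form from the derivation: x_j * exp(theta . x) / (1 + exp(theta . x))."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return x[j] * math.exp(z) / (1.0 + math.exp(z))


def numeric_partial(theta, x, j, h=1e-6):
    """Central finite difference of f with respect to theta_j."""
    up = list(theta)
    down = list(theta)
    up[j] += h
    down[j] -= h
    return (f(up, x) - f(down, x)) / (2.0 * h)


if __name__ == "__main__":
    # Hypothetical sample values, chosen only for illustration.
    theta = [0.5, -1.2, 0.3]
    x = [1.0, 2.0, -0.7]
    for j in range(len(theta)):
        a = analytic_partial(theta, x, j)
        n = numeric_partial(theta, x, j)
        print(f"j={j}: analytic={a:.8f} numeric={n:.8f}")
```

If the two columns agree to several decimal places for every `j`, the factor $x_j$ in the numerator is doing its job; dropping it, or replacing $e^{\theta x'}$ with $e^{x_j}$, would make them disagree.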

EDIT: About your calculations, two points. First, people sometimes write $\log$ when they mean $\ln$; I do not know whether that is the case here, but you should check. Second, when you calculate the derivative of $e^{\theta x'}$ you must apply the chain rule, which is where the factor $x_j$ comes from.
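As a sketch of how these two points change the steps in the question, assume that $\log$ is the natural logarithm and keep the substitution $u = 1+e^{\theta x^i}$ from the question. The two factors become

$$\frac{du}{d\theta_j} = x^i_j e^{\theta x^i}, \qquad \frac{dy}{du} = \frac{1}{u},$$

so that

$$\frac{dy}{d\theta_j} = \frac{dy}{du}\cdot\frac{du}{d\theta_j} = \frac{x^i_j e^{\theta x^i}}{1+e^{\theta x^i}}.$$

Multiplying the numerator and denominator by $e^{-\theta x^i}$ gives the form quoted from the comments:

$$\frac{x^i_j e^{\theta x^i}}{1+e^{\theta x^i}} = \frac{x^i_j}{e^{-\theta x^i}(1+e^{\theta x^i})} = \frac{x^i_j}{1+e^{-\theta x^i}}.$$

If $\log$ really did mean $\log_{10}$, the only change would be an extra constant factor of $\frac{1}{\ln 10}$.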
