[Math] What’s the “trick” in the log derivative trick?

Tags: logarithms, partial derivative

The following is often referred to as the "log derivative trick".

$$\frac{\nabla_\theta p(X,\theta)}{p(X, \theta)} = \nabla_\theta \log p(X,\theta)$$

For example here, here, and in several other places (usually in reference to reinforcement learning).

Is it not just calculus? $\frac{\partial}{\partial x} \log f(x) = \frac{f'(x)}{f(x)}$
Is there anything else going on here?

Best Answer

It's a "trick", when you use it to calculate $\nabla_\theta p(X,\theta)$ via the (hopefully, sometimes) easier expression $\log p(X,\theta)$. So the use is to write it as $$ \nabla_\theta p(X,\theta)=p(X,\theta)\,\nabla_\theta\log p(X,\theta), $$ in cases where the right-hand-side is easier than the left-hand-side. Typically, when $p$ has lots of products and exponents.