[Math] Generalized power rule for derivatives

derivatives, exponentiation

Background

This background is not really necessary to answer my question, but I included it here to provide context.

This question has some programming aspects to it as well, but since my question is mainly about math, I decided to ask it here.

I'm trying to extend the implementation of automatic differentiation found here. This implementation, assuming I read it properly, does not work for functions of the form $F(x)=f(x)^{g(x)}$. I'm trying to modify it so that it does work for such functions.

Question

I'm trying to find the derivative of functions of the form $F(x)=f(x)^{g(x)}$. I specifically only care about the "normal" cases, where $g(x)$ is a constant integer or $f(x)$ is positive. Wikipedia has provided me with a "Generalized Power Rule":

$$(f^g)^\prime = f^g\left(f^\prime\frac{g}{f}+g^\prime\ln f\right)$$
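(For reference, this rule drops out of writing $f^g = e^{g\ln f}$, which is valid precisely when $f > 0$, and differentiating with the chain and product rules:

$$\left(f^g\right)^\prime = \left(e^{g\ln f}\right)^\prime = e^{g\ln f}\left(g^\prime\ln f + g\,\frac{f^\prime}{f}\right) = f^g\left(f^\prime\frac{g}{f}+g^\prime\ln f\right),$$

which also shows where the positivity restriction comes from.)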

The generalized rule, however, does not work for $f\leq 0$. In my implementation it is difficult to tell which of the two cases I'm working with, so I would rather not have to implement the generalized rule for the latter case and the basic power rule for the former.

Is there a rule that works for both cases?

Best Answer

You have two conflicting goals here. If $y$ is arbitrary, then $x^y$ only makes sense for $x>0$. Imagine, for example, that $y = \frac{1}{2}$. Then $x^y = \sqrt{x}$; what would that mean for negative $x$?

Note that switching to complex numbers doesn't help much: negative numbers do have square roots there, but those are non-unique, and what's worse, the number of solutions is highly dependent on $y$! E.g., $y^n = x$ has $n$ solutions in $\mathbb{C}$ (for $x \neq 0$). Which one is $x^{\frac{1}{n}}$ supposed to be?

So you'll have to distinguish between two cases. One is $f(x)^{g(x)}$ for positive $f$, and the other is $f(x)^k$ for constants $k \in \mathbb{Z}$ (i.e., no fractional exponents). You could generalize the second case to $f(x)^{g(x)}$ for functions $g$ which take only integer values, but since such functions are either constant or discontinuous, that case isn't really interesting for purposes of differentiation, I think.
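To make the case split concrete, here is a minimal forward-mode sketch in Python, assuming a dual-number representation of (value, derivative) pairs; the names `Dual` and `dual_pow` are mine, not from the implementation linked in the question. The point is that the branch can be decided from the exponent's type at runtime:

```python
import math
from dataclasses import dataclass

@dataclass
class Dual:
    """A (value, derivative) pair for forward-mode AD."""
    val: float  # f(x)
    der: float  # f'(x)

def dual_pow(f: Dual, g) -> Dual:
    """Differentiate f(x)**g, split into the two 'normal' cases."""
    if isinstance(g, int):
        # Basic power rule: (f^k)' = k * f^(k-1) * f'.
        # Works for f <= 0 too (for k < 1, f must be nonzero).
        return Dual(f.val ** g, g * f.val ** (g - 1) * f.der)
    if isinstance(g, Dual):
        if f.val <= 0:
            raise ValueError("f(x)^g(x) requires f(x) > 0 for a non-integer exponent")
        # Generalized power rule: (f^g)' = f^g * (f' * g/f + g' * ln f).
        v = f.val ** g.val
        return Dual(v, v * (f.der * g.val / f.val + g.der * math.log(f.val)))
    raise TypeError("exponent must be an int constant or a Dual")
```

For example, `dual_pow(Dual(2.0, 1.0), Dual(2.0, 1.0))` (i.e., $x^x$ at $x=2$) returns value $4$ and derivative $4(1+\ln 2)$, matching $x^x(\ln x + 1)$, while `dual_pow(Dual(-2.0, 1.0), 3)` returns $(-8, 12)$, which the generalized rule alone cannot deliver.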

BTW, a far more interesting (and maybe solvable!) question is how to deal with non-negative $f$ which may nevertheless take the value zero. $f(x)^{g(x)}$ is perfectly well-defined there, but you'll still run into problems with the logarithm. In some cases these problems are due to the fact that the derivative does, in fact, not exist at those points. But not in all cases! For example, take $f(x) = x^2$ with a constant exponent $g$: then $f^g = x^{2g}$ has derivative $0$ at $x=0$ (for $g \geq 1$, say). The reason is, basically, that since $g$ is constant here, the $g^\prime\ln f$ term doesn't matter, because $g^\prime = 0$, and the blow-up of $f^\prime\frac{g}{f}$ is cancelled by the factor $f^g$ in front. But you can't cancel things that way in all cases; sometimes that produces wrong answers, because it depends on how fast things go to zero or to infinity, respectively.
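Concretely, with $f(x) = x^2$ and constant $g$ (reading the $0\cdot\ln f$ term as a limit):

$$f^g\left(f^\prime\frac{g}{f}+g^\prime\ln f\right) = x^{2g}\left(2x\cdot\frac{g}{x^2} + 0\cdot\ln x^2\right) = 2g\,x^{2g-1} \to 0 \quad \text{as } x \to 0,$$

for $g > \frac{1}{2}$, even though $\frac{g}{f}$ and $\ln f$ individually blow up at $0$.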


You might ask, then, why the non-uniqueness mentioned above doesn't prevent us from sensibly defining $\sqrt[n]{x}$; after all, even in $\mathbb{R}$, $y^n = x$ has two solutions for positive $x$ when $n$ is even. The reason is twofold:

  1. The number of solutions doesn't explode as badly. We have one solution of $y^n = x$ for odd $n$, and two for even $n$.

  2. There's an order on $\mathbb{R}$, which makes the definition of $\sqrt[n]{x}$ as the (unique!) positive solution of $y^n = x$ quite natural.

The effect of (1) and (2) is, for example, that while it's not true in general that $\sqrt[n]{x^n} = x$, we do at least get $\sqrt[n]{x^n} = |x|$ for even $n$. Trying to do the same over the complex numbers fails horribly. We could attempt to define $\sqrt[n]{x}$ as the solution of $y^n = x$ with the smallest angle (assuming we agree to measure angles counter-clockwise from the positive real axis). But then an $n$-th root always has an angle smaller than $\frac{2\pi}{n}$, so $\sqrt[n]{x^n}$ and $x$ would have very little in common except that their $n$-th powers agree.
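For instance, over $\mathbb{R}$ we get $\sqrt{(-2)^2} = \sqrt{4} = 2 = |-2|$, whereas over $\mathbb{C}$ with the smallest-angle convention,

$$\sqrt{(-i)^2} = \sqrt{-1} = i \neq -i,$$

and no analogue of the absolute-value relation survives.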
