Derive unknown partial derivative of function composition from known partial derivative

chain rule, function-and-relation-composition, partial derivative

Suppose that we want to find a partial derivative of a composition of two functions. Specifically, let $f$, $g$, and $h$ be functions such that for all $x \in \mathbb R$ and $y \in \mathbb R$, $$h(x, y) = g(f(x, y)).$$

Let $\phi$ be some function that takes two inputs (e.g., two real numbers). We are given the partial derivative of $h$ with respect to $x$, $$\frac{\partial h(x, y)}{\partial x} = \phi\left(f(x,y), \frac{\partial f(x, y)}{\partial x}\right),$$ and we would like to use this to find the partial derivative of $h$ with respect to $y$, $\frac{\partial h(x, y)}{\partial y}.$

Since the function $h$ and the given expression for $\frac{\partial h(x, y)}{\partial x}$ both depend on $x$ and $y$ only through $f(x, y)$ and its partial derivative, it seems intuitive to argue that we can simply "swap" the partial derivative with respect to $x$ for the one with respect to $y$, as follows:

$$\frac{\partial h(x, y)}{\partial y} = \phi\left(f(x,y), \frac{\partial f(x, y)}{\partial y}\right)?$$

Does this statement follow? If so, how can we formally justify this? Perhaps something similar to the chain rule gives this result? (This result does not follow directly from the chain rule but the idea seems very similar.) If the argument is not valid, a counterexample would be appreciated. Thank you!

Not necessary for answering the question, but may be helpful for building intuition:

Someone I showed this to asked, "why not just apply the chain rule?" For intuition on what kind of problem might require a statement like the one above, instead of a simple application of the chain rule, see the following question, which is a special case of the question above: Multivariable Chain Rule: Conditional Independence and "Swapping" Partial Derivatives. In that question, the given expression for $\frac{\partial h(x, y)}{\partial x}$ is complicated; it involves an expectation, a summation, and random variables, and does not follow from a simple application of the chain rule. So being able to leverage the given expression for $\frac{\partial h(x, y)}{\partial x}$ to get a similar expression for $\frac{\partial h(x, y)}{\partial y}$ would be useful.

Best Answer

The statement does not follow in general. A counterexample:

$g: \mathbb R \to \mathbb R$ is defined as $g(x) := x$.

$f: \mathbb R \times \mathbb R \to \mathbb R$ is defined as $f(x, y) := x + y^2$.

$\phi: \mathbb R \times \mathbb R \to \mathbb R$ is defined as $\phi(x, y) := 1$.

Notice that $\frac{\partial h(x,y)}{\partial x} = \frac{\partial g(f(x,y))}{\partial f(x,y)}\frac{\partial f(x,y)}{\partial x} = 1(1) = 1 = \phi(f(x,y), \frac{\partial f(x,y)}{\partial x}),$ so the "given" statement holds.

However, $\frac{\partial h(x,y)}{\partial y} = \frac{\partial g(f(x,y))}{\partial f(x,y)}\frac{\partial f(x,y)}{\partial y} = 1(2y) = 2y$. This is not equal to $\phi(f(x,y), \frac{\partial f(x,y)}{\partial y}) = 1$ except when $y = \frac{1}{2}$. Therefore, the proposed statement above does not follow.
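
As a quick numerical sanity check of this counterexample, here is a minimal Python sketch. The helper `partial`, the step size `eps`, and the test point `(0.5, 2.0)` are my own illustrative choices and not part of the original question; the derivatives are estimated by central differences rather than computed symbolically.

```python
# Counterexample from the answer: g(u) = u, f(x, y) = x + y^2, phi(u, v) = 1.
def f(x, y):
    return x + y**2

def g(u):
    return u

def h(x, y):
    return g(f(x, y))

def phi(u, v):
    # Constant function: ignores both arguments.
    return 1.0

def partial(func, x, y, wrt, eps=1e-6):
    """Central-difference estimate of a partial derivative at (x, y).

    `wrt` is "x" or "y". This helper is hypothetical, for illustration only.
    """
    if wrt == "x":
        return (func(x + eps, y) - func(x - eps, y)) / (2 * eps)
    return (func(x, y + eps) - func(x, y - eps)) / (2 * eps)

# Arbitrary test point with y != 1/2 (assumption: any such point works).
x0, y0 = 0.5, 2.0

dh_dx = partial(h, x0, y0, "x")
dh_dy = partial(h, x0, y0, "y")

# The "given" identity: dh/dx = phi(f, df/dx).
given = phi(f(x0, y0), partial(f, x0, y0, "x"))

# The proposed "swapped" identity: dh/dy = phi(f, df/dy)?
swapped = phi(f(x0, y0), partial(f, x0, y0, "y"))

print("dh/dx:", dh_dx, "phi(f, df/dx):", given)
print("dh/dy:", dh_dy, "phi(f, df/dy):", swapped)
```

At this point the given identity holds (both sides are approximately $1$), while $\frac{\partial h}{\partial y}$ comes out close to $2y = 4$ and the swapped right-hand side is still $1$.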
