Chain Rule with a Function Depending on Functions of Different Variables

chain rule, multivariable-calculus, partial derivative

I'm currently learning multivariable calculus, and I have a few questions about the chain rule, which I just learned.

The chain rule states that if f is a function of x,y, and x,y are themselves functions of u,v, that is,

f = f(x,y) and x = x(u,v) and y = y(u,v),

then the partial derivatives are related by the following formulas:

$$
\frac{\partial f}{\partial u}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial u}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial u}
$$

$$
\frac{\partial f}{\partial v}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial v}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial v}
$$

My question is: how does the chain rule change when, say,

f = f(x,y) and x = x(u,v) and y = y(u,b)?

The variables x and y are both functions of the variable u, but x is also a function of v while y is also a function of b.

And what if, say,
f = f(x,y) and x = x(u,v) while y = y(a,b)?

Now f depends on x and y, but x and y depend on completely different sets of variables.

Best Answer

The formulas you have stated above are not the chain rule in the original sense.

The chain rule simply says that if you compose two differentiable functions, say $f:M\to N$ and $g:V\to W$ where $g(V)\subseteq M$, then $f\circ g:V\to N$ is again a differentiable function and $D(f\circ g)(p) = Df(g(p))\,Dg(p)$ for every $p\in V$. Here $Df$ denotes the (total) derivative of $f$.


In order to transfer this idea to your problem one needs to make further assumptions on the given functions $f$, $x$ and $y$.

I will assume that $f:\mathbb{R}^2\to \mathbb{R}$, with ${x\choose y}\mapsto f {x\choose y}$, $x:\mathbb{R}^2\to \mathbb{R}$ with ${u\choose v}\mapsto x{u\choose v}$ and $y:\mathbb{R}^2\to \mathbb{R}$ with ${u\choose v}\mapsto y{u\choose v}$. Note that you can't compose $f$ solely with $x$ or solely with $y$, because the images of $x$ and $y$ are subsets of $\mathbb{R}$, not of $\mathbb{R}^2$. However, you can define another function $g:\mathbb{R}^2\to \mathbb{R}^2$ with $g{u \choose v}={x(u,v) \choose y(u,v)}$. Now, under the assumption that $f$ and $g$ are differentiable, you can apply the chain rule as follows (where $D_x$, $D_y$, $D_u$ and $D_v$ denote the corresponding partial derivatives):

$$D(f\circ g){u \choose v} = Df\left(g{u \choose v}\right)Dg{u \choose v}=\left(\begin{array}{rr} D_xf(g{u \choose v}), & D_yf(g{u \choose v}) \\\end{array}\right) \left(\begin{array}{rr} D_ux{u \choose v} & D_vx{u \choose v} \\ D_uy{u \choose v} & D_vy{u \choose v} \\ \end{array}\right)$$

If you multiply both matrices you get:

$$D(f\circ g){u \choose v}= \left(\begin{array}{rr} D_xf(g{u \choose v})D_ux{u \choose v}+ D_yf(g{u \choose v})D_uy{u \choose v} ,& D_xf(g{u \choose v})D_vx{u \choose v}+ D_yf(g{u \choose v})D_vy{u \choose v}\\\end{array}\right).$$

The first entry of this row vector is your first formula and the second entry is your second formula. So this is basically what you were trying to say when you referred to applying the chain rule, but stated rigorously.
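The product $Df(g)\,Dg$ can also be checked numerically. The following is only a minimal sketch: the concrete choices of `f`, `x` and `y` are hypothetical examples that do not come from the question, and the helper `partial` uses a central difference, so it approximates the derivatives rather than computing them exactly.

```python
import math

# Hypothetical example functions, chosen only for illustration.
def x(u, v):
    return u * v

def y(u, v):
    return u + v ** 2

def f(x_val, y_val):
    return math.sin(x_val) * y_val

def partial(func, args, i, h=1e-6):
    """Central-difference estimate of the partial derivative of func
    with respect to its i-th argument at the point args."""
    up = list(args)
    down = list(args)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2 * h)

def chain_rule_gradient(u, v):
    """Row vector Df(g(u, v)) multiplied by the 2x2 Jacobian Dg(u, v)."""
    point = (x(u, v), y(u, v))                      # g(u, v)
    df = [partial(f, point, 0), partial(f, point, 1)]
    dg = [
        [partial(x, (u, v), 0), partial(x, (u, v), 1)],
        [partial(y, (u, v), 0), partial(y, (u, v), 1)],
    ]
    # Entry j is D_x f * (row 1 of Dg)[j] + D_y f * (row 2 of Dg)[j].
    return [df[0] * dg[0][j] + df[1] * dg[1][j] for j in range(2)]

def direct_gradient(u, v):
    """Differentiate the composition f(x(u, v), y(u, v)) directly."""
    def composed(u_, v_):
        return f(x(u_, v_), y(u_, v_))
    return [partial(composed, (u, v), 0), partial(composed, (u, v), 1)]

print(chain_rule_gradient(0.7, -0.3))
print(direct_gradient(0.7, -0.3))
```

Both printed vectors should agree up to the finite-difference error, which is what the chain rule predicts for this example.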

Now you want to extend the example to the case where $y$ is a function of $a,b$. This means that $y$ is defined on a different domain than $x$. As I mentioned earlier, you can't simply compose $f$ with $x$ and $y$; you must define another function $g$ whose image lies in the domain of $f$. The question now is how to define this function $g$. It must be something like: $$g:\mathbb{R}^4\to\mathbb{R}^2,~g\left(\begin{array}{r} u\\ v\\ a\\ b\end{array}\right)={x(u,v,a,b) \choose y(u,v,a,b)}={x(u,v) \choose y(a,b)}.$$ Note that $x$ and $y$ are now technically functions of $(u,v,a,b)$. However, the variables $a,b$ don't appear in the function $x$ and $u,v$ don't appear in $y$. You can now apply the chain rule to $(f\circ g):\mathbb{R}^4\to\mathbb{R}$ as well. The difference from the case where $y$ is also a function of $u,v$ is that: $$D(f\circ g){u \choose v} =\left(\begin{array}{rr} D_xf(g{u \choose v}), & D_yf(g{u \choose v}) \\\end{array}\right)\left(\begin{array}{rr} D_ux{u \choose v} & D_vx{u \choose v} \\ D_uy{u \choose v} & D_vy{u \choose v} \\ \end{array}\right)$$ now becomes

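$$D(f\circ g)(u,v,a,b) = \left(\begin{array}{rr} D_xf(g(u,v,a,b)), & D_yf(g(u,v,a,b)) \end{array}\right) \left(\begin{array}{rrrr} D_ux{u \choose v} & D_vx{u \choose v} & D_ax{u \choose v} & D_bx{u \choose v}\\ D_uy{a \choose b} & D_vy{a \choose b} & D_ay{a \choose b} & D_by{a \choose b} \end{array}\right)=\left(\begin{array}{rr} D_xf(g(u,v,a,b)), & D_yf(g(u,v,a,b)) \end{array}\right)\left(\begin{array}{rrrr} D_ux{u \choose v} & D_vx{u \choose v} & 0 & 0\\ 0 & 0 & D_ay{a \choose b} & D_by{a \choose b} \end{array}\right).$$

So multiplying both matrices now yields four formulas, one for each of the variables $u,v,a,b$. In the notation of your question:

$$\frac{\partial f}{\partial u}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial u},\quad \frac{\partial f}{\partial v}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial v},\quad \frac{\partial f}{\partial a}=\frac{\partial f}{\partial y}\frac{\partial y}{\partial a},\quad \frac{\partial f}{\partial b}=\frac{\partial f}{\partial y}\frac{\partial y}{\partial b}.$$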
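The block structure of the Jacobian for $x=x(u,v)$, $y=y(a,b)$ can be illustrated with a small numerical sketch. Everything concrete here is an assumption made for illustration: the example functions `f`, `x`, `y` are hypothetical, and `partial` is a central-difference approximation, not an exact derivative.

```python
import math

# Hypothetical example functions, chosen only for illustration.
def x(u, v):
    return u ** 2 * v

def y(a, b):
    return a * math.exp(b)

def f(x_val, y_val):
    return x_val * y_val + y_val ** 2

def partial(func, args, i, h=1e-6):
    """Central-difference estimate of the partial derivative of func
    with respect to its i-th argument at the point args."""
    up = list(args)
    down = list(args)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2 * h)

def g(u, v, a, b):
    """g maps R^4 to R^2; x ignores (a, b) and y ignores (u, v)."""
    return (x(u, v), y(a, b))

def jacobian_g(u, v, a, b):
    """2x4 Jacobian of g. The zeros sit where x does not depend on
    (a, b) and where y does not depend on (u, v)."""
    return [
        [partial(x, (u, v), 0), partial(x, (u, v), 1), 0.0, 0.0],
        [0.0, 0.0, partial(y, (a, b), 0), partial(y, (a, b), 1)],
    ]

def chain_rule_gradient(u, v, a, b):
    """Row vector Df(g(p)) multiplied by the 2x4 Jacobian Dg(p)."""
    point = g(u, v, a, b)
    df = [partial(f, point, 0), partial(f, point, 1)]
    dg = jacobian_g(u, v, a, b)
    return [df[0] * dg[0][j] + df[1] * dg[1][j] for j in range(4)]

def direct_gradient(u, v, a, b):
    """Differentiate the composition f(g(u, v, a, b)) directly."""
    def composed(*p):
        return f(*g(*p))
    return [partial(composed, (u, v, a, b), i) for i in range(4)]

print(chain_rule_gradient(1.2, 0.5, -0.4, 0.3))
print(direct_gradient(1.2, 0.5, -0.4, 0.3))
```

The same construction appears to cover the mixed case from the question, $x=x(u,v)$ and $y=y(u,b)$: there one would take $g:\mathbb{R}^3\to\mathbb{R}^2$ with Jacobian rows $(D_ux,\ D_vx,\ 0)$ and $(D_uy,\ 0,\ D_by)$, so only the derivative with respect to $u$ keeps both terms.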
