Chain Rule – Simple Proof Using Δy/Δx = dy/dx|x=x1 + k

calculus, differential-operators, limits

In an online lecture (link to Youtube), the professor proves the Chain Rule using the following statement:

$$\frac{\Delta y}{\Delta x} = \frac{dy}{dx}\biggr|_{x=x_1} + k$$

$$\Delta y = \frac{dy}{dx}\biggr|_{x=x_1} \Delta x+ k\Delta x$$
where $$\lim_{\Delta x \to 0}k=0.$$

Then he says that $x$ and $y$ are functions of a third variable $t$, so:

$$\begin{align*}
\frac{\Delta y}{\Delta t} &= \frac{dy}{dx}\biggr|_{x=x_1} \frac{\Delta x}{\Delta t}+ k\frac{\Delta x}{\Delta t}\\[0.2in]
\lim_{\Delta t \to 0}\frac{\Delta y}{\Delta t} &=\lim_{\Delta t \to 0} \frac{dy}{dx}\biggr|_{x=x_1} \lim_{\Delta t \to 0}\frac{\Delta x}{\Delta t}+ \lim_{\Delta t \to 0}k \lim_{\Delta t \to 0}\frac{\Delta x}{\Delta t}\\[0.2in]
\frac{dy}{dt} &= \frac{dy}{dx} \frac{dx}{dt}+ 0\cdot\frac{dx}{dt}
\end{align*}$$

The part I don't understand is when he says that the last term in the last line is $0$ not only because of $k$, but also because, as $\Delta t \to 0$, $dx$ approaches $0$ as well.

But why? I don't understand how $dx$ approaches zero as $dt$ approaches $0$? Shouldn't it be the other way around? That $dx/dt$ should increase without bound as $dt$ approaches $0$?

Also, another question: how are we allowed to simply define $\lim_{\Delta x \to 0}k=0$? It makes sense graphically, but since $k$ is a number, doesn't that violate the fact that $\lim_{\Delta x \to 0}c=c$? Is it something to do with $\Delta x$?

Best Answer

Ultimately, I think the reason is that $x$ is a differentiable function of $t$, and therefore continuous, which in particular means that $\Delta x\rightarrow 0$ as $\Delta t\rightarrow 0$. That said, I really don't like this approach to the chain rule. It is a bit cobbled together.
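To make this concrete, here is a small worked example. The functions are my own choice for illustration, not the ones from the lecture: take $y = x^2$ and $x = 3t$, and fix a point $x_1 = 3t_1$. Then

$$\begin{align*}
\Delta x &= 3\,\Delta t,\\
\Delta y &= (x_1+\Delta x)^2 - x_1^2 = 2x_1\,\Delta x + (\Delta x)^2,\\
\frac{\Delta y}{\Delta x} &= 2x_1 + \Delta x = \frac{dy}{dx}\biggr|_{x=x_1} + k \quad\text{with } k = \Delta x.
\end{align*}$$

So in this example $k$ is not a fixed number: it depends on $\Delta x$ and shrinks to $0$ with it, which is why $\lim_{\Delta x\to 0}k = 0$ does not conflict with $\lim_{\Delta x\to 0}c = c$ for a constant $c$. Also, $\Delta x = 3\,\Delta t \to 0$ as $\Delta t \to 0$, while the ratio $\Delta x/\Delta t = 3$ stays finite. Dividing $\Delta y$ by $\Delta t$ gives

$$\frac{\Delta y}{\Delta t} = 2x_1\cdot 3 + \Delta x\cdot 3 \;\longrightarrow\; 6x_1 \quad\text{as } \Delta t\to 0,$$

which agrees with differentiating $y = 9t^2$ directly: $\frac{dy}{dt} = 18t_1 = 6x_1$.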


I feel that this is the better way to do the chain rule (which is much clearer from the get-go, I think). Let $f$ and $g$ be differentiable and $g$ non-constant. (If $g$ is a constant function, then clearly $(f\circ g)'$ is zero since $f\circ g$ would be constant, so $(f\circ g)'(x) = f'(g(x))g'(x)$ holds trivially because $g' = 0$.) I'll assume they're both defined on all of $\mathbb{R}$ to avoid annoyances with domains and ranges.

Then we wish to evaluate $(f\circ g)'(x)$. From the definition of the derivative, we have

$$ (f\circ g)'(x) = \frac{d}{dx}f(g(x)) = \lim_{\Delta x\rightarrow 0} \frac{f(g(x+\Delta x))-f(g(x))}{\Delta x}.$$

Let's multiply by a clever form of $1$ to make things easier on ourselves, particularly we'll multiply by

$$\frac{g(x+\Delta x)-g(x)}{g(x+\Delta x)-g(x)}.$$

This is where I required that $g$ be non-constant. If it were constant, the above expression would make no sense since we would be dividing $0$ by $0$. Note that this resembles the terms inside of $f$. This is not by accident. Additionally we have that $g(x+\Delta x)\approx g(x)+g'(x)\Delta x$ for small $\Delta x$ (this is what the derivative is for - linear approximations). If $g'(x) = 0$, then $g(x+\Delta x)\approx g(x)$ and in this case, we have

$$\lim_{\Delta x\rightarrow 0}\frac{f(g(x+\Delta x))-f(g(x))}{\Delta x} = 0.$$

Again, this is clearly equal to $f'(g(x))g'(x)$ since $g'(x) = 0$. If $g'(x)\neq 0$, then $g(x+\Delta x)-g(x)\neq 0$ for all sufficiently small nonzero $\Delta x$, so we can make use of our clever form of $1$ to get

$$(f\circ g)'(x) = \lim_{\Delta x\rightarrow 0}\frac{f(g(x+\Delta x))-f(g(x))}{g(x+\Delta x)-g(x)}\frac{g(x+\Delta x)-g(x)}{\Delta x}.$$
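As a quick sanity check of this factorisation, here is an illustrative pair of my own choosing (it is not needed for the general argument): $f(u) = u^2$ and $g(x) = x^2$, at a point $x \neq 0$ so that $g'(x) = 2x \neq 0$. The two factors become

$$\begin{align*}
\frac{f(g(x+\Delta x))-f(g(x))}{g(x+\Delta x)-g(x)} &= \frac{g(x+\Delta x)^2 - g(x)^2}{g(x+\Delta x)-g(x)} = g(x+\Delta x) + g(x) \;\longrightarrow\; 2x^2,\\
\frac{g(x+\Delta x)-g(x)}{\Delta x} &= \frac{(x+\Delta x)^2 - x^2}{\Delta x} = 2x + \Delta x \;\longrightarrow\; 2x.
\end{align*}$$

Their product tends to $2x^2\cdot 2x = 4x^3$, which matches differentiating $f(g(x)) = x^4$ directly. Here the denominator $g(x+\Delta x)-g(x) = \Delta x\,(2x+\Delta x)$ is nonzero whenever $0 < |\Delta x| < 2|x|$, so the division is legitimate.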

Now both pieces look eerily like a derivative (which is what we want), except that the first piece has $g$ in it. However, if $\Delta x\rightarrow 0$, we know that $g(x+\Delta x)\rightarrow g(x)$ since $g$ is differentiable (and therefore continuous). Clearly $\lim_{\Delta x\rightarrow 0}\frac{g(x+\Delta x)-g(x)}{\Delta x}$ exists since $g$ is differentiable, so we need only argue that $\lim_{\Delta x\rightarrow 0}\frac{f(g(x+\Delta x))-f(g(x))}{g(x+\Delta x)-g(x)}$ is well-defined. By the limit theorems, if both limits exist, we can distribute the limit to each piece and evaluate.

So we want to argue that

$$\lim_{\Delta x\rightarrow 0}\frac{f(g(x+\Delta x))-f(g(x))}{g(x+\Delta x)-g(x)}$$

is well-defined. Using our approximation for $g(x+\Delta x)$ from above, we have

$$\lim_{\Delta x\rightarrow 0}\frac{f(g(x)+g'(x)\Delta x)-f(g(x))}{g(x)+g'(x)\Delta x-g(x)}.$$

Cancelling appropriate terms we have

$$\lim_{\Delta x\rightarrow 0}\frac{f(g(x)+g'(x)\Delta x)-f(g(x))}{g'(x)\Delta x}.$$

Repeating the same logic as above with $f(g(x)+g'(x)\Delta x)$, we have that $f(g(x)+g'(x)\Delta x)\approx f(g(x))+f'(g(x))g'(x)\Delta x$. And so we get

$$\lim_{\Delta x\rightarrow 0}\frac{f(g(x))+f'(g(x))g'(x)\Delta x-f(g(x))}{g'(x)\Delta x} = f'(g(x)).$$

Since the limit of the first piece makes sense and the limit of the second piece makes sense, we can distribute the limits to get that

$$(f\circ g)'(x) = \lim_{\Delta x\rightarrow 0}\frac{f(g(x+\Delta x))-f(g(x))}{g(x+\Delta x)-g(x)}\lim_{\Delta x\rightarrow 0}\frac{g(x+\Delta x)-g(x)}{\Delta x} = f'(g(x))g'(x)$$

by our above calculations. So in each case that emerged we had that $(f\circ g)'(x) = f'(g(x))g'(x)$ and so we conclude that the chain rule holds.
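For a quick check of the final formula, here is an example of my own choosing: $f(u) = u^3$ and $g(x) = x^2 + 1$. The chain rule gives

$$(f\circ g)'(x) = f'(g(x))\,g'(x) = 3(x^2+1)^2\cdot 2x = 6x(x^2+1)^2,$$

while expanding first and differentiating term by term gives

$$\frac{d}{dx}\left(x^6 + 3x^4 + 3x^2 + 1\right) = 6x^5 + 12x^3 + 6x = 6x\left(x^4 + 2x^2 + 1\right) = 6x(x^2+1)^2,$$

so the two computations agree.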
