Calculus – Why the Correct Proof of the Chain Rule is Correct

calculus, chain rule, derivatives

There is a correct and an incorrect proof going around when it comes to the Chain Rule (see below). The problem with the incorrect proof is that $g(x)-g(a)$ might be $0$ for values of $x$ arbitrarily close to $a$ (with $x \ne a$), which creates a division by zero.

Question

I can't get my head around why the correct proof solves the problem of the incorrect proof. Why can we just define a function $E$ and suddenly all our problems disappear?

I just don't really get what actually happens in the correct proof. It hasn't "clicked" in my brain yet. Any help would be much appreciated.

By the way, is my "correct proof" below indeed correct?

Incorrect proof:

$$\lim \limits_{x \to a}\frac{f(g(x))-f(g(a))}{x-a}=\lim \limits_{x \to a}\frac{f(g(x))-f(g(a))}{g(x)-g(a)}\times\frac{g(x)-g(a)}{x-a}=f'(g(x))g'(x)$$
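To see the division by zero concretely, here is a minimal Python sketch. The choices $f = \sin$ and a constant $g$ are assumptions made purely for illustration (they are not from the post); a constant $g$ is the simplest case where $g(x)-g(a)=0$ for every $x$, not only in the limit.

```python
import math

def f(u):
    return math.sin(u)

def g(x):
    # Illustrative assumption: a constant inner function,
    # so g(x) - g(a) is 0 for every x.
    return 2.0

def naive_quotient(x, a):
    """Product of the two difference quotients used in the incorrect proof."""
    outer = (f(g(x)) - f(g(a))) / (g(x) - g(a))  # divides by zero when g(x) == g(a)
    inner = (g(x) - g(a)) / (x - a)
    return outer * inner

def direct_quotient(x, a):
    """Difference quotient of the composite itself, which is well defined."""
    return (f(g(x)) - f(g(a))) / (x - a)

a = 1.0
try:
    naive_quotient(a + 1e-3, a)
    naive_failed = False
except ZeroDivisionError:
    naive_failed = True

print(naive_failed)                  # True
print(direct_quotient(a + 1e-3, a))  # 0.0
```

The composite's own difference quotient is fine here; only the factored form breaks, which is the gap the second proof has to close.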

Correct proof:

We first define a function $E$

$$E(0)=0$$
$$E(g(x)-g(a))=\frac{f(g(x))-f(g(a))}{g(x)-g(a)}-f'(g(x))$$

In any case:
$$f(g(x))-f(g(a))=(E(g(x)-g(a))+f'(g(x)))\times(g(x)-g(a))$$

Dividing by $x-a$ and taking the limit we get:

$$\begin{align}
\frac{d}{dx}f(g(x))&=\lim \limits_{x \to a}\frac{f(g(x))-f(g(a))}{x-a}\\
&=\lim \limits_{x \to a}(E(g(x)-g(a))+f'(g(x)))\times\frac{g(x)-g(a)}{x-a}\\
&=f'(g(x))g'(x)
\end{align}$$


EDIT: In other words: we basically state that when $g(x)=g(a)$:

$$\frac{f(g(x))-f(g(a))}{g(x)-g(a)}-f'(g(x))=0$$

But why can we state that? As I understand it, this is true for the limit, but why are we allowed to also state it for the actual value?

Best Answer

There are two things wrong with your original proof, and the "EDIT" section of the original post is also wrong.

First problem: To define a function $E$, you have to say how to apply $E$ to an arbitrary number $h$. You haven't done that. Here is a better definition of $E$: $E(0) = 0$, and if $h \ne 0$ then
\begin{equation} E(h) = \frac{f(g(a)+h) - f(g(a))}{h} - f'(g(a)). \end{equation}
For $h \ne 0$, the formula defining $E(h)$ can be rearranged to read:
\begin{equation} (E(h) + f'(g(a))) \times h = f(g(a)+h) - f(g(a)). \end{equation}
But notice that this last equation is also true if $h=0$, since both sides are $0$, so the equation is true for all values of $h$. Plugging in $g(x)-g(a)$ for $h$, we get
\begin{equation} (E(g(x)-g(a))+f'(g(a))) \times (g(x)-g(a)) = f(g(x))-f(g(a)). \end{equation}
This is (almost) the same as your "in any case" equation; the difference is that yours has $f'(g(x))$ where it should have $f'(g(a))$.
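The piecewise definition of $E$ can be written out directly as code. The following is a rough sketch, assuming $f = \sin$ and the value $g(a) = 2$ (both chosen only for illustration; the helper name `make_E` is hypothetical). It checks numerically that $(E(h) + f'(g(a))) \times h = f(g(a)+h) - f(g(a))$ holds for several $h$, including $h = 0$.

```python
import math

def f(u):
    return math.sin(u)

def f_prime(u):
    return math.cos(u)

def make_E(b):
    """Return E for the base point b = g(a), following the piecewise definition."""
    def E(h):
        if h == 0:
            return 0.0  # the value assigned by definition, no division involved
        return (f(b + h) - f(b)) / h - f_prime(b)
    return E

b = 2.0  # stands in for g(a); an illustrative assumption
E = make_E(b)

for h in (0.5, -0.25, 1e-3, 0.0):
    lhs = (E(h) + f_prime(b)) * h
    rhs = f(b + h) - f(b)
    # Compare up to floating-point rounding.
    print(h, math.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-12))
```

Note that the `h == 0` branch never evaluates the quotient at all, so nothing about the quotient's meaning changes; $E$ is simply a new function that has its own value at $0$.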

Second problem: In your final calculation, you are mixing up the derivative with the value of the derivative at a particular point. The limit
\begin{equation} \lim_{x \to a} \frac{f(g(x))-f(g(a))}{x-a} \end{equation}
doesn't give you the derivative, it gives you the value of the derivative at $a$. So the proof should end like this:
\begin{align}
\left.\frac{d}{dx}f(g(x))\right|_{x=a} &= \lim_{x \to a} \frac{f(g(x))-f(g(a))}{x-a}\\
&= \lim_{x \to a} (E(g(x)-g(a))+f'(g(a))) \times \frac{g(x) - g(a)}{x-a}\\
&= f'(g(a))g'(a).
\end{align}
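As a quick numerical sanity check of the conclusion, the sketch below compares the difference quotient of the composite near $a$ with the single number $f'(g(a))g'(a)$. The functions $f = \sin$, $g(x) = x^2$ and the point $a = 1.5$ are assumptions picked for illustration only.

```python
import math

def f(u):
    return math.sin(u)

def g(x):
    return x * x

def f_prime(u):
    return math.cos(u)

def g_prime(x):
    return 2.0 * x

a = 1.5
predicted = f_prime(g(a)) * g_prime(a)  # a single number: f'(g(a)) g'(a)

for dx in (1e-2, 1e-4, 1e-6):
    quotient = (f(g(a + dx)) - f(g(a))) / dx
    print(f"dx = {dx:.0e}   quotient = {quotient:.6f}   predicted = {predicted:.6f}")
```

The quotient settles toward one fixed value as `dx` shrinks, which is the sense in which the limit gives the value of the derivative at $a$ rather than a function of $x$.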

There is a subtle point in the last step that you may be missing. Since $g$ is differentiable at $a$, it is continuous at $a$, so $\lim_{x \to a} (g(x) - g(a)) = g(a)-g(a) = 0$. But why does it follow that $\lim_{x \to a}E(g(x)-g(a)) = E(0) = 0$? The answer is: because $E$ is continuous at $0$. (Look in your calculus book in the section on continuous functions. You will find a theorem that says that if $\lim_{x \to a} f(x) = L$ and $g$ is continuous at $L$, then $\lim_{x \to a} g(f(x)) = g(L)$, where $f$ and $g$ stand for arbitrary functions. That theorem is being used in this step, with $x \mapsto g(x)-g(a)$ as the inner function, $L = 0$, and $E$ as the outer function.) So to have a complete proof, you need to verify that $E$ is continuous at $0$. To verify that, check that $\lim_{h \to 0} E(h) = 0 = E(0)$. In this limit, $h$ is approaching $0$ but it is not equal to $0$, so we can use the formula for $E(h)$ when $h \ne 0$, together with the definition of the derivative of $f$ at $g(a)$:
\begin{equation} \lim_{h \to 0} E(h) = \lim_{h \to 0} \left(\frac{f(g(a)+h)-f(g(a))}{h} - f'(g(a))\right) = f'(g(a))-f'(g(a)) = 0. \end{equation}
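The continuity of $E$ at $0$ can also be observed numerically. This is only an illustration under assumed choices ($f = \sin$ and $g(a) = 2$, neither taken from the post), not a substitute for the limit argument: it tabulates $|E(h)|$ for shrinking $h \ne 0$.

```python
import math

b = 2.0  # stands in for g(a); an illustrative assumption

def E(h):
    """E for f = sin at the base point b, with E(0) = 0 by definition."""
    if h == 0:
        return 0.0
    return (math.sin(b + h) - math.sin(b)) / h - math.cos(b)

hs = [10.0 ** -k for k in range(1, 7)]  # 1e-1 down to 1e-6
values = [abs(E(h)) for h in hs]

for h, v in zip(hs, values):
    print(f"h = {h:.0e}   |E(h)| = {v:.3e}")
```

The printed magnitudes shrink roughly in proportion to $h$, consistent with $\lim_{h \to 0} E(h) = 0 = E(0)$. Much smaller values of $h$ would eventually be dominated by floating-point cancellation, which is why the table stops at $10^{-6}$.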

Finally, the problem with the "EDIT" section of the original post: You seem to think that by defining $E$, we are somehow changing the meaning of the expression
\begin{equation} \frac{f(g(x))-f(g(a))}{g(x)-g(a)}. \end{equation}
We are not. That expression still means what it meant before, so it is undefined when $g(x) = g(a)$. All we're doing is defining a new function $E$, and it is only formulas involving the letter $E$ whose meaning is affected by that definition. No justification is needed for this: you can define a new function however you want.
