On the ‘wrong proof’ of the chain rule

analysis, chain rule, derivatives, real-analysis, solution-verification

I am looking through an old analysis course of mine and was pondering the proof of the chain rule (especially the notorious 'wrong proof' that one can give). I'd be happy if someone were willing to verify my reasoning below. I end with an actual question.

Let's start with the following nice result.

Let $f\colon \mathbb{R}\to \mathbb{R}$ be a continuous function which is differentiable on $\mathbb{R}_0$. Assume that $\lim_{x\to 0}f'(x)=L\in \mathbb{R}$. Then $f$ is differentiable in $0$.

Proof: For each $h\neq 0$, the mean value theorem yields a $c_h\in \mathbb{R}$ strictly between $h$ and $0$ such that $f'(c_h)=\frac{f(h)-f(0)}{h}$. Letting $h\to 0$, it is clear that $c_h\to 0$, as $0<|c_h|<|h|$ for each $h$. Hence $$\lim_{h\to 0}\frac{f(h)-f(0)}{h}=\lim_{h\to 0}f'(c_h)=L.$$ $\square$

Great, let's apply this to the following function: $$\phi\colon \mathbb{R}\to \mathbb{R}:x\mapsto \begin{cases}x^3\sin(\frac{1}{x}) & \mbox{ if }x\neq 0,\\0 & \mbox{ if } x=0.\end{cases}$$
Clearly $\phi$ is differentiable on $\mathbb{R}_0$ and $$\phi'(x)=3x^2\sin(\frac{1}{x})-x^3\cos(\frac{1}{x})\frac{1}{x^2}=3x^2\sin(\frac{1}{x})-x\cos(\frac{1}{x})$$ for all $x\neq 0$.
It is straightforward to see that $\lim_{x\to 0}\phi'(x)=0$ and thus the above result yields that $\phi'(0)=0$ (in particular $\phi$ is differentiable on the whole of $\mathbb{R}$).
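As a sanity check that does not rely on the result above, both claims can be verified by hand; the following is only a sketch, using the elementary bounds $|\sin|\leq 1$ and $|\cos|\leq 1$. For $x\neq 0$,
$$|\phi'(x)|\leq 3x^2+|x|\to 0\quad (x\to 0),$$
and for $h\neq 0$,
$$\left|\frac{\phi(h)-\phi(0)}{h}\right|=\left|h^2\sin(\tfrac{1}{h})\right|\leq h^2\to 0\quad (h\to 0),$$
so $\phi'(0)=0$ also follows directly from the definition of the derivative.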

Now at this point, recall the chain rule.

Let $f,g\colon \mathbb{R}\to \mathbb{R}$ be functions. If $a\in\mathbb{R}$ is such that $f'(a)$ and $g'(f(a))$ both exist, then $(g\circ f)'(a)=g'(f(a))f'(a)$.

The obvious argument to try is the following 'wrong proof':

\begin{eqnarray}
\lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{x-a} &=& \lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{f(x)-f(a)}\cdot \frac{f(x)-f(a)}{x-a}\\
&=& \lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{f(x)-f(a)}\cdot \lim_{x\to a}\frac{f(x)-f(a)}{x-a}\\
&=& g'(f(a))f'(a).
\end{eqnarray}

Here we used that $f$ is continuous in $a$ to see that $f(x)\to f(a)$ as $x\to a$. $\triangle$

However, there is an obvious error in the above reasoning. If for example $f$ is a constant function $f(x)=f(a)$ for all $x\in \mathbb{R}$, then $\lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{f(x)-f(a)}=\lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{0}$ is nonsensical!

Having said that, it is also clear that the above proof does work for functions such that $\exists \delta>0:\forall x\in (a-\delta,a+\delta)\setminus \{a\}:f(x)\neq f(a)$. In that case, $f(x)$ does not equal $f(a)$ for $x$ near $a$ (and $x\neq a$). So the above proof only fails for a particular type of function, the easiest of which are constant functions. However, for a constant function $f$, one can calculate $(g\circ f)'(a)$ directly and show that it is $0$.

A natural question at this point is whether there exists a nonconstant function $f$ such that $f$ is differentiable in $a$ and $f(x)=f(a)$ infinitely often for $x$ near $a$. The answer is yes, and the function $\phi$ given in the example above (with $a=0$) satisfies these properties. (Also, the Wikipedia page on the chain rule gives the function $f(x)=x^2\sin(\frac{1}{x})$ for $x\neq 0$ and $f(0)=0$ as an example. The result above does not apply to this function, since $\lim_{x\to 0}f'(x)$ does not exist; it is nevertheless differentiable in $0$ with $f'(0)=0$, because $\left|\frac{f(h)-f(0)}{h}\right|\leq |h|$ for $h\neq 0$, so it also pinpoints the failure of the 'wrong proof'.)
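To make the claim about $\phi$ concrete, here is a short check; the sequence $x_n=\frac{1}{n\pi}$ is just one convenient choice for illustration. For every integer $n\geq 1$,
$$\phi\left(\tfrac{1}{n\pi}\right)=\frac{1}{(n\pi)^3}\sin(n\pi)=0=\phi(0),$$
and $x_n\to 0$, so $\phi(x)=\phi(0)$ at points $x\neq 0$ arbitrarily close to $0$. On the other hand, $\phi$ is not constant, since
$$\phi\left(\tfrac{2}{\pi}\right)=\frac{8}{\pi^3}\sin\left(\tfrac{\pi}{2}\right)=\frac{8}{\pi^3}\neq 0.$$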

In general let $f$ be such a function (thus $\forall \delta>0:\exists x\neq a: |x-a|<\delta$ and $f(x)=f(a)$). If $\lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{x-a}$ exists, then we can compute this limit by choosing an appropriate sequence $x_n\to a$. For each $n\geq 1$, there exists an $x_n\neq a$ such that $|x_n-a|<\frac{1}{n}$ and $f(x_n)=f(a)$. It follows that
\begin{eqnarray}
\lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{x-a}&=&\lim_{n\to \infty}\frac{g\circ f(x_n)-g\circ f(a)}{x_n-a}\\
&=& \lim_{n\to \infty}\frac{g\circ f(a)-g\circ f(a)}{x_n-a}\\
&=& 0.
\end{eqnarray}

This shows that if $f$ is a function for which the 'wrong proof' of the chain rule fails, then $(g\circ f)'(a)=0$. Of course, I was only able to show this under the assumption that $(g\circ f)'(a)$ actually exists (which of course is true, as one can actually prove the chain rule). Nonetheless, this raises the question whether there is a more direct way of showing that $(g\circ f)'(a)$ actually exists (and equals zero) if $f$ is a function for which the 'wrong proof' fails. If so, one can actually fix this 'wrong proof' by considering two cases.

Best Answer

Actually, the "wrong proof" is not so bad, as the problem can only occur when $f'(a)=0$. To fix the problem, it suffices to handle two cases: the case $f'(a)\neq 0$ (for which the "wrong proof" goes through) and the exceptional case $f'(a)=0.$ For completeness, one states the Chain Rule again.

Proposition. Let $f,g:{\mathbb R}\rightarrow {\mathbb R}$ be functions. If $a\in {\mathbb R}$ is such that $f'(a)$ and $g'(f(a))$ both exist, then $$(g\circ f)'(a)=g'(f(a))f'(a).$$

Case 1. $f'(a)\neq 0.$

In this case, there exists $\delta>0$ such that $$f(x)-f(a)\neq 0$$ if $0<|x-a|<\delta.$ To see this, let $0<\epsilon<\frac{|f'(a)|}2$ be given. Then there exists $\delta>0$ such that $$0<|x-a|<\delta\Rightarrow \left|\frac{f(x)-f(a)}{x-a}-f'(a)\right|<\epsilon$$ $$\Rightarrow \left|\frac{f(x)-f(a)}{x-a}\right|>|f'(a)|-\epsilon>\frac{|f'(a)|}2>0$$ $$\Rightarrow |f(x)-f(a)|\neq 0,$$ as required. This means that as $x\rightarrow a$, the "wrong proof" works.

Case 2. $f'(a)=0.$

In this case, one needs to prove that $(g\circ f)'(a)=0.$ One considers two possibilities:

$$\frac{(g\circ f)(x)-(g\circ f)(a)}{x-a}=\left\{\begin{array}{cc}0&{\rm if~}f(x)=f(a)\\ \frac{g(f(x))-g(f(a))}{f(x)-f(a)}\cdot \frac{f(x)-f(a)}{x-a}&{\rm if~}f(x)\neq f(a).\end{array} \right.$$ It suffices to show that for every $\epsilon>0$, there exists $\delta>0$ such that $$0<|x-a|<\delta\Rightarrow \left|\frac{(g\circ f)(x)-(g\circ f)(a)}{x-a}\right|<\epsilon.\qquad (1)$$ It is clear that for $x\neq a$ with $f(x)=f(a)$, the left-hand side of the inequality in (1) equals $0$, so (1) holds trivially. So for the given $\epsilon>0$, one just needs to find $\delta>0$ to address the second possibility: $f(x)\neq f(a)$. Namely, one needs to show in this case that $$\left|\frac{(g\circ f)(x)-(g\circ f)(a)}{x-a}\right|=\left|\frac{g(f(x))-g(f(a))}{f(x)-f(a)}\right|\cdot \left|\frac{f(x)-f(a)}{x-a}\right|<\epsilon.\qquad (2)$$ This is more or less straightforward, but one spells out the details below.

Since $g'(f(a))$ exists (hence the difference quotient of $g$ is bounded near $f(a)$), there exist $\epsilon_1>0$ and $M>0$ such that $$0<|f(x)-f(a)|<\epsilon_1\Rightarrow \left|\frac{g(f(x))-g(f(a))}{f(x)-f(a)}\right|\leq M.$$

Since $f'(a)=0,$ there exists $\delta_1>0$ such that $$0<|x-a|<\delta_1\Rightarrow \left|\frac{f(x)-f(a)}{x-a}\right|<\frac{\epsilon}M.$$

Since $f$ is continuous at $a,$ there exists $\delta_2>0$ such that $$|x-a|<\delta_2\Rightarrow |f(x)-f(a)|<\epsilon_1.$$

Now let $\delta:=\min(\delta_1,\delta_2).$ Then one sees that $$0<|x-a|<\delta\Rightarrow \left|\frac{g(f(x))-g(f(a))}{f(x)-f(a)}\right|\cdot \left|\frac{f(x)-f(a)}{x-a}\right|<M\cdot \frac{\epsilon}M=\epsilon,$$ provided that $f(x)-f(a)\neq 0,$ as required by (2). QED
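As an illustration of Case 2, one can run the argument with the function $\phi$ from the question, $a=0$, and the purely illustrative choice $g(y)=e^y$ (not part of the original question); the constants below are one possible choice, not the only one. By the mean value theorem, $|e^y-1|\leq e\,|y|$ whenever $|y|<1$, so one may take $\epsilon_1=1$ and $M=e$. Given $\epsilon>0$, set
$$\delta_1:=\sqrt{\epsilon/e},\qquad \delta_2:=1,\qquad \delta:=\min(\delta_1,\delta_2).$$
Then for $0<|x|<\delta$ one has $|\phi(x)|\leq |x|^3<1$ and $\left|\frac{\phi(x)}{x}\right|\leq x^2<\frac{\epsilon}{e}$, hence, whenever $\phi(x)\neq 0$,
$$\left|\frac{e^{\phi(x)}-e^{\phi(0)}}{x}\right|=\left|\frac{e^{\phi(x)}-1}{\phi(x)}\right|\cdot\left|\frac{\phi(x)}{x}\right|<e\cdot\frac{\epsilon}{e}=\epsilon,$$
while the quotient is simply $0$ when $\phi(x)=0$. So $(g\circ\phi)'(0)=0=g'(0)\,\phi'(0)$, in agreement with the Proposition.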
