[Math] Why does this proof of the chain rule not work

calculus, limits, proof-verification

Why is this proof not valid?

Here is my "rigorized" version. We write $$\frac{d}{dx}f\big(g(x)\big)\bigg|_{x=a}=\lim_{b\to a}\frac{f\big(g(b)\big)-f\big(g(a)\big)}{b-a}=\lim_{b\to a}\left[\frac{f\big(g(b)\big)-f\big(g(a)\big)}{g(b)-g(a)}\cdot\frac{g(b)-g(a)}{b-a}\right]$$ Now we split the limit up to get $$\lim_{b\to a}\left[\frac{f\big(g(b)\big)-f\big(g(a)\big)}{g(b)-g(a)}\right]\cdot\lim_{b\to a}\left[\frac{g(b)-g(a)}{b-a}\right]$$ In the first limit, we can set $y=g(b)$. Then by differentiability, and hence continuity, of $g$, we have $y\to g(a)$ as $b\to a$. Therefore the first limit can be expressed as $$\lim_{y\to g(a)}\left[\frac{f\big(y\big)-f\big(g(a)\big)}{y-g(a)}\right]$$ So we get, by definition of the derivative, $$\frac{d}{dx}f\big(g(x)\big)\bigg|_{x=a}=f'\big(g(a)\big)\,g'(a)$$

There are two objections given to this proof. The first is that one cannot multiply by the quantity $\big(g(b)-g(a)\big)/\big(g(b)-g(a)\big)$ as $g$ may be constant around $x=a$ and the expression would then be undefined. However, it seems obvious that we can simply consider the case of $g$ constant separately (and the case of $g'$ not defined due to 'infinite oscillations', the difficulty cited in the wiki article), where it is easily seen that the formula is valid. (Indeed, this exact point is made in the comments under the answer to this question.)
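To make the first objection concrete, here is a minimal Python sketch of an inner function that is not constant near $a=0$ and yet makes the inserted quotient undefined at points arbitrarily close to $a$. The functions `g` and `f` below are hypothetical choices for illustration only (they do not come from any of the linked sources): `g(x)` is $x^2$ times the distance from $1/x$ to the nearest integer, with `g(0) = 0`, and exact rational arithmetic is used so that the zeros of `g` are hit exactly.

```python
from fractions import Fraction

def g(x):
    """Hypothetical inner function: x^2 * (distance from 1/x to the nearest integer), g(0) = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    t = 1 / x
    return x * x * abs(t - round(t))

def f(y):
    """Hypothetical outer function with f'(0) = 3."""
    return y * y + 3 * y

def naive_first_factor(b, a=Fraction(0)):
    """The factor (f(g(b)) - f(g(a))) / (g(b) - g(a)) from the intuitive proof."""
    return (f(g(b)) - f(g(a))) / (g(b) - g(a))

# g vanishes at every b = 1/n, so these points accumulate at a = 0.
zeros = [Fraction(1, n) for n in (10, 100, 1000)]
undefined_at = []
for b in zeros:
    try:
        naive_first_factor(b)
    except ZeroDivisionError:
        # g(b) == g(a): the quotient is 0/0 here.
        undefined_at.append(b)

print("undefined at:", undefined_at)
print("defined elsewhere, e.g. at 2/5:", naive_first_factor(Fraction(2, 5)))
```

Under these assumptions no punctured neighborhood of $0$ avoids the points where the quotient is undefined, so this situation is not covered by treating "g constant near a" as the only special case.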

The second is that the limit substitution is not justified. I don't understand this either. That limit rule (in which such substitutions are allowed for continuous functions) could easily be proven, but that has nothing to do with the chain rule. This is the objection given in an answer to this question.

I found three other relevant sources.

  1. The first proof in the Wikipedia article explicitly avoids the proof above on the basis that the expression noted may be undefined.

  2. On the last page of this PDF the author offers students $3$ points if they can explain why the argument I gave above is a flawed proof.

  3. In this PDF the author similarly showcases the flawed proof, and then moves on to the 'real' proof.

The 'real' proofs are, shall we say, 'not pretty'. I am wondering if this salvaged version of the intuitive proof really does not work, and why?


EDIT: THEORETICAL ADDENDUM FOR THE CASE OF INFINITE OSCILLATIONS

Suppose that $g(a)=g(b)$ for infinitely many $b$ in all neighborhoods of $a$. Then I claim: if $g'(a)$ is defined, it must equal $0$.

Proof. The difference quotient $$\frac{g(x)-g(a)}{x-a}$$ equals $0$ for some $x$ in $(a-\delta,a)\cup (a,a+\delta)$, regardless of how small $\delta>0$ is. Therefore no $L\ne 0$ can be its limit as $x\to a$: take $\epsilon=|L|/2$; then every punctured $\delta$-neighborhood of $a$ contains a point where the quotient is $0$, which lies at distance $|L|>\epsilon$ from $L$. As the limit exists by hypothesis, it must equal $0$.
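The claim above can be checked on a concrete example. The following is a small sketch, assuming the hypothetical function `g(x)` equal to $x^2$ times the distance from $1/x$ to the nearest integer (with `g(0) = 0`), which is my own illustrative choice. It evaluates the difference quotient at $a=0$ both at zeros of `g` and at the points where the quotient is as large as it can be for this `g`, using exact rational arithmetic.

```python
from fractions import Fraction

def g(x):
    """Hypothetical example: x^2 * (distance from 1/x to the nearest integer), g(0) = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    t = 1 / x
    return x * x * abs(t - round(t))

def diff_quotient(x, a=Fraction(0)):
    """(g(x) - g(a)) / (x - a) for x != a."""
    return (g(x) - g(a)) / (x - a)

# Points where g(x) = g(0): the quotient is exactly 0 there.
zeros = [Fraction(1, n) for n in (10, 100, 1000)]
# Points where 1/x is a half-integer: the distance term is 1/2, its maximum.
peaks = [Fraction(2, 2 * n + 1) for n in (10, 100, 1000)]

for x in zeros + peaks:
    print(x, diff_quotient(x))
```

For this example the quotient is $0$ at the zeros and $x/2$ at the peaks, so it is squeezed toward $0$ as $x\to 0$, in line with $g'(0)=0$.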

Next I claim: the chain rule is valid in this case.

Proof. The chain rule formula returns $f'(g(a))g'(a)=f'(g(a))\cdot 0=0$ in this case. We prove that, in fact, the derivative is equal to zero.

Because $g'(a)=0$, for any $\epsilon^*>0$ we can pick $\delta>0$ such that $$|g(x)-g(a)|<\epsilon^* |x-a|$$ whenever $0<|x-a|<\delta$. Now suppose that $f'(g(a))=c$. For any $\epsilon>0$ we have $$|f(y)-f(g(a))|\le\max\{|c+\epsilon|,|c-\epsilon|\}|y-g(a)|$$ when $y$ is sufficiently close to $g(a)$ (with equality when $y=g(a)$). Since $g$ is continuous at $a$, we can make $|x-a|$ small enough that both conditions hold with $y=g(x)$. Then we have $$|f(g(x))-f(g(a))|\le\max\{|c+\epsilon|,|c-\epsilon|\}|g(x)-g(a)|$$ $$\frac{|f(g(x))-f(g(a))|}{|x-a|}\le\max\{|c+\epsilon|,|c-\epsilon|\}\frac{|g(x)-g(a)|}{|x-a|}$$ But $$\frac{|g(x)-g(a)|}{|x-a|}<\epsilon^*$$ so we have $$\left|\frac{f(g(x))-f(g(a))}{x-a}\right|<\epsilon^*\max\{|c+\epsilon|,|c-\epsilon|\}$$ Fixing $\epsilon$ and letting $\epsilon^*$ be arbitrarily small, the difference quotient can be made arbitrarily small, and hence $$\frac{d}{dx}\left(f\big(g(x)\big)\right)\bigg|_{x=a}=0$$ as was to be shown.

Best Answer

Rather than post many long comments, I thought it better to write a short answer. Fix a point $a$ and write $(f\circ g)(x) = f(g(x))$. Then we have $$(f\circ g)'(a) = \lim_{x \to a}\frac{f(g(x)) - f(g(a))}{x - a}$$ We are given that $$\lim_{y \to g(a)}\frac{f(y) - f(g(a))}{y - g(a)} = f'(g(a)) = A, \qquad \lim_{x \to a}\frac{g(x) - g(a)}{x - a} = g'(a) = B$$ We need to show that $(f\circ g)'(a) = AB$. We distinguish two cases:

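1) There is a neighborhood of $a$ in which $g(x) \neq g(a)$ when $x \neq a$. In this case we have $$(f\circ g)'(a) = \lim_{x \to a}\frac{f(g(x)) - f(g(a))}{x - a} = \lim_{x \to a}\frac{f(g(x)) - f(g(a))}{g(x) - g(a)}\cdot\frac{g(x) - g(a)}{x - a} = AB$$ Here the first factor tends to $A$ because $g$ is continuous at $a$, so $g(x) \to g(a)$ as $x \to a$, while $g(x) \neq g(a)$ throughout the neighborhood.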

2) Every neighborhood of $a$ contains infinitely many points $x\neq a$ such that $g(x) = g(a)$. Hence it is possible to find a sequence $x_{n} \to a$ such that $x_{n} \neq a$ and $g(x_{n}) = g(a)$. Since the limit $B$ exists, it can be computed along this sequence: $$B = \lim_{x \to a}\frac{g(x) - g(a)}{x - a} = \lim_{n \to \infty}\frac{g(x_{n}) - g(a)}{x_{n} - a} = 0$$ Now we need to show that $(f\circ g)'(a) = AB = 0$. Consider the ratio $$F(x, a) = \frac{f(g(x)) - f(g(a))}{x - a}$$ If $g(x) = g(a)$ then $F(x, a) = 0$. If $g(x) \neq g(a)$ then we can write $$F(x, a) = \frac{f(g(x)) - f(g(a))}{g(x) - g(a)}\cdot\frac{g(x) - g(a)}{x - a}$$ where the first factor is near $A$ and the second factor is near $B = 0$ once $x$ is close to $a$. So in a sufficiently small neighborhood of $a$ we either have $F(x, a) = 0$ or $|F(x, a)|$ is very small. Using an $\epsilon, \delta$ argument we can show that for any $\epsilon > 0$ there is a $\delta > 0$ such that $|F(x, a)| < \epsilon$ whenever $0 < |x - a| < \delta$. This shows that $\lim_{x \to a}F(x, a) = 0$, i.e. $(f\circ g)'(a) = 0$, as was to be shown.
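The $\epsilon, \delta$ step in case 2 can be tried out on a concrete pair of functions. The sketch below is only an illustration under assumptions of my own: the hypothetical inner function `g(x)` is $x^2$ times the distance from $1/x$ to the nearest integer (so $g(1/n)=g(0)=0$ for every integer $n\ge 1$), the hypothetical outer function is `f(y) = y^2 + 3y`, and the choice of $\delta$ in `delta_for` relies on bounds that hold for this particular pair, not in general.

```python
from fractions import Fraction

def g(x):
    """Hypothetical inner function: x^2 * (distance from 1/x to the nearest integer), g(0) = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    t = 1 / x
    return x * x * abs(t - round(t))

def f(y):
    """Hypothetical outer function with f'(0) = 3."""
    return y * y + 3 * y

def F(x, a=Fraction(0)):
    """Difference quotient of the composition f(g(.)) at a, for x != a."""
    return (f(g(x)) - f(g(a))) / (x - a)

def delta_for(eps):
    # Specific to this g and f: |g(x)| <= x^2 / 2 and |g(x) + 3| <= 7/2
    # for |x| <= 1, hence |F(x, 0)| <= (7/4) |x|.
    return min(Fraction(1), eps / 2)

def check(eps, samples=200):
    """Check |F(x, 0)| < eps on a grid of points with 0 < |x| < delta."""
    delta = delta_for(eps)
    for k in range(1, samples):
        for sign in (1, -1):
            x = sign * delta * Fraction(k, samples)
            if abs(F(x)) >= eps:
                return False
    return True

results = {eps: check(eps) for eps in (Fraction(1, 10), Fraction(1, 1000))}
print(results)
```

The grid contains both kinds of points from the argument: zeros of `g`, where `F` is exactly $0$, and points with $g(x)\neq g(0)$, where `F` is small because the second factor is small. A finite grid is of course only a sanity check, not a proof.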
