Actually, the "wrong proof" is not so bad, as the problem can only happen when $f'(a)=0$ . To fix the problem, it suffices to handle two cases: the case $f'(a)\neq 0$ (for which the "wrong proof" goes through) and the exceptional case $f'(a)=0.$ For completeness, one states the Chain Rule again.
Proposition. Let $f,g:{\mathbb R}\rightarrow {\mathbb R}$ be functions. If $a\in {\mathbb R}$ such that $f'(a)$ and $g'(f(a))$ both exist, then $$(g\circ f)'(a)=g'(f(a))f'(a).$$
Case 1. $f'(a)\neq 0.$
In this case, there exists $\delta>0$ such that $$f(x)-f(a)\neq 0$$ if $0<|x-a|<\delta.$ To see this, let $0<\epsilon<\frac{|f'(a)|}2$ be given. Then there exists $\delta>0$ such that $$0<|x-a|<\delta\Rightarrow \left|\frac{f(x)-f(a)}{x-a}-f'(a)\right|<\epsilon$$ $$\Rightarrow \left|\frac{f(x)-f(a)}{x-a}\right|>|f'(a)|-\epsilon>\frac{|f'(a)|}2>0$$ $$\Rightarrow |f(x)-f(a)|\neq 0,$$ as required. This means that as $x\rightarrow a$, the "wrong proof" works.
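This step can be illustrated numerically. With the hypothetical choice $f(x)=x^3+x$ and $a=0$ (so $f'(a)=1\neq 0$; this example is not from the text), the absolute difference quotient stays above $|f'(a)|/2$ for $x$ near $a$, which forces $f(x)\neq f(a)$ there:

```python
# Illustration of Case 1 (assumed example: f(x) = x**3 + x, a = 0, f'(0) = 1).
# For x close to a, |f(x) - f(a)| / |x - a| > |f'(a)| / 2, hence f(x) != f(a).
f = lambda x: x**3 + x
a, fprime_a = 0.0, 1.0

for x in (a + 10.0**(-k) for k in range(1, 8)):
    quotient = abs(f(x) - f(a)) / abs(x - a)
    assert quotient > abs(fprime_a) / 2  # so f(x) - f(a) cannot be 0
print("f(x) != f(a) for all tested x near a")
```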
Case 2. $f'(a)=0.$
In this case, one needs to prove that $(g\circ f)'(a)=0.$ One considers two possibilities:
$$\frac{(g\circ f)(x)-(g\circ f)(a)}{x-a}=\left\{\begin{array}{cc}0&{\rm if~}f(x)=f(a)\\
\frac{g(f(x))-g(f(a))}{f(x)-f(a)}\cdot \frac{f(x)-f(a)}{x-a}&{\rm if~}f(x)\neq f(a).\end{array} \right.$$ It suffices to show that for every $\epsilon>0$, there exists $\delta>0$ such that $$0<|x-a|<\delta\Rightarrow \left|\frac{(g\circ f)(x)-(g\circ f)(a)}{x-a}\right|<\epsilon.\qquad (1)$$ It is clear that for $0<|x-a|<\delta$, if $f(x)=f(a)$, the left hand side of (1) is $0$, so the inequality reads $0<\epsilon$, which trivially holds. So for the given $\epsilon>0$, one just needs to find $\delta>0$ to address the second possibility, $f(x)\neq f(a)$. Namely, one needs to show in this case that $$\left|\frac{(g\circ f)(x)-(g\circ f)(a)}{x-a}\right|=\left|\frac{g(f(x))-g(f(a))}{f(x)-f(a)}\right|\cdot \left|\frac{f(x)-f(a)}{x-a}\right|<\epsilon.\qquad (2)$$ This is more or less straightforward, but one spells out the details below.
Since $g'(f(a))$ exists, the difference quotients of $g$ at $f(a)$ are bounded near $f(a)$: there exist $\epsilon_1>0$ and $M>0$ such that $$0<|f(x)-f(a)|<\epsilon_1\Rightarrow \left|\frac{g(f(x))-g(f(a))}{f(x)-f(a)}\right|\leq M.$$
Since $f'(a)=0,$ there exists $\delta_1>0$ such that $$0<|x-a|<\delta_1\Rightarrow \left|\frac{f(x)-f(a)}{x-a}\right|<\frac{\epsilon}M.$$
Since $f$ is continuous at $a,$ there exists $\delta_2>0$ such that $$|x-a|<\delta_2\Rightarrow |f(x)-f(a)|<\epsilon_1.$$
Now let $\delta:=\min(\delta_1,\delta_2).$ Then one sees that $$0<|x-a|<\delta\Rightarrow \left|\frac{g(f(x))-g(f(a))}{f(x)-f(a)}\right|\cdot \left|\frac{f(x)-f(a)}{x-a}\right|<M\cdot \frac{\epsilon}M=\epsilon,$$ provided that $f(x)-f(a)\neq 0,$ as required by (2). QED
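To see that Case 2 is genuinely needed, consider the standard example $f(x)=x^2\sin(1/x)$ with $f(0)=0$, together with, say, $g(y)=e^y$ (these choices are illustrative, not from the text). Here $f'(0)=0$, and $f(x)=f(0)$ at the points $x=1/(n\pi)$, so $f(x)-f(a)$ vanishes on every neighbourhood of $0$; nevertheless the difference quotients of $g\circ f$ at $0$ tend to $g'(f(0))f'(0)=0$, as the proof asserts:

```python
import math

# Illustrative example (not from the text): f(x) = x^2 sin(1/x), f(0) = 0.
# f is differentiable at 0 with f'(0) = 0, but f(x) = f(0) = 0 at x = 1/(n*pi),
# so the two-case split in the proof above is really necessary.
def f(x):
    return x * x * math.sin(1.0 / x) if x != 0 else 0.0

def g(y):
    return math.exp(y)  # g'(f(0)) = g'(0) = 1

for x in (10.0**(-k) for k in range(1, 7)):
    quotient = (g(f(x)) - g(f(0))) / (x - 0.0)
    # |quotient| is of order |x|, so it tends to 0 = g'(f(0)) * f'(0)
    assert abs(quotient) < 2 * abs(x)
```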
Yes, that seems to be a careless typo in the text you flagged.
I believe that the conditions imposed on $r(h)$ in Proposition 8.11 ought to have stated explicitly that $r(h)$ is continuous as $h\to 0$. Then it is evident that $r(0)$ must be zero. The same applies to the expression $s(k)$.
There is a much cleaner conceptual approach to the definition of the derivative, developed by Carathéodory about 100 years ago and based upon the simpler concept of continuity. See e.g. Question about Caratheodory's approach and the reference to Bernstein's notes cited therein. The Chain Rule can be proved in one line using this approach.
A nice expository article explores this in detail:
The Derivative à la Carathéodory
Stephen Kuhn
The American Mathematical Monthly, Vol. 98, No. 1 (Jan., 1991), pp. 40-44 (5 pages)
https://www.jstor.org/stable/2324035
Quoting from its introduction:
But there is another, less well known, characterization of the derivative which appears in the last textbook [2] written by Constantin Caratheodory (1873-1950). This formulation is not only elegant but useful on both theoretical and pedagogical grounds. The proofs of many important theorems concerning differentiability become significantly easier and, in the process, some of the less enlightening details are rightly submerged. Caratheodory's formulation also gives us a much clearer view of the fact that continuity is essential for differentiability; indeed, the definition itself contains the necessary continuity. This formulation, which shifts the details from the theory of limits to the theory of continuous functions, requires that our students develop a clearer understanding of continuity than is typical and it demands that we reevaluate this understanding and continually reinforce it. We believe that Caratheodory's insight deserves to be better known and we hope that the present article will help in that effort.
- The multi-variable extension of Caratheodory's approach is explained in this article:
On the Differentiability of Functions of Several Variables
Michael W. Botsko, Richard A. Gosser
The American Mathematical Monthly, Vol. 92, No. 9 (Nov., 1985), pp. 663-665 (3 pages)
https://www.jstor.org/stable/2323717
Best Answer
Start with a reformulation of differentiability avoiding quotients:
A function $f: I\to\mathbb R$ on a set $I\subseteq \mathbb R$ is differentiable at $a\in I$ if and only if there exists a function $\varphi:I\to \mathbb R$ which is continuous at $a$ and satisfies $f(x)-f(a)=\varphi(x)(x-a)$ for all $x\in I$. Then $\varphi(a)=f'(a)$.
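For a concrete instance (the function below is an illustrative choice, not from the answer): for $f(x)=x^2$, the factorization $x^2-a^2=(x+a)(x-a)$ gives $\varphi(x)=x+a$, which is continuous everywhere and satisfies $\varphi(a)=2a=f'(a)$. A quick numerical check:

```python
# Caratheodory factorization for f(x) = x**2 at a point a (illustrative choice):
# f(x) - f(a) = (x + a)(x - a), so phi(x) = x + a and phi(a) = 2a = f'(a).
f = lambda x: x * x
phi = lambda x, a: x + a

a = 3.0
for x in [2.5, 3.0, 3.5, 10.0]:
    assert f(x) - f(a) == phi(x, a) * (x - a)  # the defining identity
assert phi(a, a) == 2 * a  # phi(a) recovers f'(a)
```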
If $f$ is differentiable at $a$ with corresponding function $\varphi$ and $g:f(I)\to\mathbb R$ is differentiable at $f(a)$ with corresponding function $\gamma$, we get $$g(f(x))-g(f(a))=\gamma(f(x)) (f(x)-f(a))=\gamma(f(x))\varphi(x)(x-a).$$ Since compositions and products of continuous functions are continuous, we get that $g\circ f$ is differentiable at $a$ with corresponding function $\gamma(f(x))\varphi(x)$, whose value at $a$ is $g'(f(a))f'(a)$.
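One can check the resulting identity numerically for a concrete pair (illustrative choices, not from the answer: $f(x)=x^2$, $g(y)=y^3$, $a=2$). Here $\varphi(x)=x+a$, and with $b:=f(a)$ the factorization $y^3-b^3=(y^2+yb+b^2)(y-b)$ gives $\gamma(y)=y^2+yb+b^2$:

```python
# Chain rule via Caratheodory functions (illustrative: f(x)=x**2, g(y)=y**3, a=2).
f = lambda x: x * x   # f'(a) = 2a, with phi(x) = x + a
g = lambda y: y ** 3  # g'(b) = 3b^2, with gamma(y) = y^2 + y*b + b^2

a = 2.0
b = f(a)
phi = lambda x: x + a
gamma = lambda y: y * y + y * b + b * b

# The one-line proof: g(f(x)) - g(f(a)) = gamma(f(x)) * phi(x) * (x - a)
for x in [1.0, 1.5, 2.0, 3.0]:
    assert g(f(x)) - g(f(a)) == gamma(f(x)) * phi(x) * (x - a)

# The value at a recovers the chain rule: gamma(f(a)) * phi(a) = g'(f(a)) * f'(a)
assert gamma(b) * phi(a) == (3 * b * b) * (2 * a)
```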