[Math] Help understanding Rudin’s proof of the chain rule


The first step of the proof of the chain rule in Rudin's Principles of Mathematical Analysis (Theorem 5.5, page 105) is as follows:

Theorem. Suppose $f$ is continuous on $[a,b]$, $f'(x)$ exists at some point $x\in[a,b]$, $g$ is defined on an interval $I$ which contains the range of $f$, and $g$ is differentiable at the point $f(x)$. If $$h(t)=g(f(t))\quad (a\leq t\leq b),$$ then $h$ is differentiable at $x$, and $$h'(x)=g'(f(x))f'(x).$$

Proof. Let $y=f(x)$. By the definition of the derivative, we have $$f(t)-f(x)=(t-x)[f'(x)+u(t)],$$ $$g(s)-g(y)=(s-y)[g'(y)+v(s)],$$ where $t\in[a,b]$, $s\in I$, and $u(t)\rightarrow 0$ as $t \rightarrow x$, $v(s) \rightarrow 0$ as $s\rightarrow y$.


I think I can follow the rest from here, but I don't understand this manipulation. The definition of the derivative gives $$f'(x)=\lim_{t\rightarrow x} \frac{f(t)-f(x)}{t-x}.$$ I can sort of see what's going on: it looks a little like we're multiplying both sides of the equation by $t-x$, with $u(t)$ there to make that step legitimate, but I can't figure out how it works.

Best Answer

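What Rudin really means is this: define $$u(t)=\begin{cases} \dfrac{f(t)-f(x)}{t-x}-f'(x) & \text{if } t \ne x, \\ 0 & \text{if } t = x, \end{cases}$$ for $t$ near $x$. You can see that $u(t) \to 0$ as $t \to x$ by the definition of the derivative of $f$ at $x$. The identity $$f(t)-f(x)=(t-x)[f'(x)+u(t)]$$ holds as well: for $t \ne x$ it is just the definition of $u(t)$ multiplied through by $t-x$, and for $t = x$ both sides are $0$.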
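
As a concrete sanity check, here is a small worked example. The choice $f(t)=t^2$, $x=1$ is purely illustrative and is not taken from Rudin. Here $f'(1)=2$, so for $t\neq 1$ $$u(t)=\frac{t^2-1}{t-1}-2=(t+1)-2=t-1,$$ and since $u(1)=0$ by definition, $u(t)=t-1$ for every $t$. In particular $u(t)\to 0$ as $t\to 1$, and the identity can be verified directly: $$(t-1)\,[f'(1)+u(t)]=(t-1)\,[2+(t-1)]=(t-1)(t+1)=t^2-1=f(t)-f(1).$$ So in this case $u$ is just the error made by replacing the difference quotient with the derivative, which is the role it plays in the proof.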