How can the chain rule be explained more rigorously?

Tags: calculus, chain rule, derivatives, limits

The 'proof' of the chain rule that is often used at school is unsatisfying to me because it treats derivatives as fractions:

$$
\frac{dy}{dx}=\frac{dy}{du}\times\frac{du}{dx}
$$

However, the more rigorous proofs that are used at university are unfathomable to me because they are intended for people with much more background knowledge. Is there a way I can think about the chain rule (perhaps not a rigorous proof) that acknowledges that derivatives are shorthand for limit expressions, but does not use esoteric notation or complicated methods?

For reference, here is a list of things that I do and don't know:

  • I know how to differentiate from first principles using $$f'(x)=\lim_{h\to0}\frac{f(x+h)-f(x)}{h}$$
  • Apart from the chain rule, I know the product rule and the quotient rule (but again, I don't know the proofs for these rules)
  • I know some limit laws (e.g. the quotient law for limits)
  • I don't have a rigorous understanding of limits, but I think I have a good intuitive grasp of them
  • Similarly, I have an intuitive understanding of continuous vs. discontinuous functions (continuous = not lifting your pen off the page), but I have not been taught the formal definition for continuity

Thank you for reading.

Best Answer

Nobody should use the "fraction" approach.

To provide intuition I tend to fall back on linear approximation.

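For a differentiable function and small $\epsilon$ we can write $$f(x+\epsilon)\approx f(x)+ f'(x)\epsilon$$

Applying this to the composite function $f\circ g$ gives $$(f\circ g)(x+\epsilon)\approx (f\circ g)(x)+(f\circ g)'(x)\epsilon$$

But we could also approximate $g$ first, $g(x+\epsilon)\approx g(x)+g'(x)\epsilon$, and then approximate $f$ at the point $g(x)$ with the small increment $g'(x)\epsilon$: $$(f\circ g)(x+\epsilon)\approx f(g(x)+ g'(x)\epsilon)\approx (f\circ g)(x)+f'(g(x))g'(x)\epsilon$$

Comparing the coefficients of $\epsilon$ in the two approximations shows that $$(f\circ g)'(x)=f'(g(x))g'(x)$$ as desired.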
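The comparison above can also be checked numerically. The following is only a sketch under assumptions of my own: the choices $f(y)=\sin y$ and $g(x)=x^2$, and the helper names `linear_prediction` and `approximation_error`, are illustrative and not part of the argument. If the chain-rule coefficient $f'(g(x))g'(x)$ is the right one, the gap between $f(g(x+\epsilon))$ and the linear prediction should shrink much faster than $\epsilon$ itself.

```python
import math

# Illustrative choices (assumptions): f(y) = sin(y), g(x) = x**2.
def f(y):
    return math.sin(y)

def f_prime(y):
    return math.cos(y)

def g(x):
    return x ** 2

def g_prime(x):
    return 2 * x

def linear_prediction(x, eps):
    """Approximate f(g(x + eps)) using the chain-rule coefficient f'(g(x)) * g'(x)."""
    return f(g(x)) + f_prime(g(x)) * g_prime(x) * eps

def approximation_error(x, eps):
    """Absolute gap between the true value and the linear prediction."""
    return abs(f(g(x + eps)) - linear_prediction(x, eps))

if __name__ == "__main__":
    x = 1.0
    for eps in (1e-1, 1e-2, 1e-3):
        # Print the step size next to the error of the linear prediction.
        print(eps, approximation_error(x, eps))
```

Shrinking $\epsilon$ by a factor of 10 should shrink the error by roughly a factor of 100, which is what one expects when the linear coefficient is correct and only higher-order terms remain.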

Of course, to make this rigorous one has to argue that the coefficient in the linear approximation is uniquely defined and so on, but students ought to be aware that this interpretation of derivatives is an important tool in numerical analysis and the chain rule drops out of it.
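As a sketch of what "uniquely defined" could mean here (the constant $A$ and the remainder $r$ are names introduced only for this illustration), suppose the approximation is written with an explicit error term that is small compared with $\epsilon$:

$$
f(x+\epsilon)=f(x)+A\,\epsilon+r(\epsilon),\qquad \lim_{\epsilon\to0}\frac{r(\epsilon)}{\epsilon}=0
$$

Dividing by $\epsilon$ and rearranging gives

$$
A=\frac{f(x+\epsilon)-f(x)}{\epsilon}-\frac{r(\epsilon)}{\epsilon}\;\longrightarrow\;f'(x)\quad\text{as }\epsilon\to0
$$

Since $A$ does not depend on $\epsilon$, the only coefficient that can work is $A=f'(x)$, which is the first-principles limit. This is what justifies comparing coefficients in the two approximations of $(f\circ g)(x+\epsilon)$.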