The purpose or proof behind the chain rule

calculus derivatives soft-question

For example, take a function $\sin x$. The derivative of this function is $\cos x$.

The chain rule states that $\frac{d}{dx} f(g(x))$ is $f'(g(x)) \cdot \frac{d}{dx} g(x)$. Going back to the example above, instead of $\sin x$ let's take $\sin 2x$.

Differentiating it without the chain rule, we get $\cos 2x$. However, using the chain rule, we get $2\cos 2x$.

So my problem is that I don't see the purpose behind the chain rule. Why should the derivative of $\sin 2x$ be $2\cos 2x$?

Is there a proof of the chain rule? I really need to know, as I am getting many questions wrong by not using the chain rule.

Best Answer

This is a good question in my opinion. WHY is the chain rule right? My quick answer is that you are already using the chain rule without knowing it in the product rule, power rule, etc.: $$ \frac{d}{dx}x^n = nx^{n-1}\cdot \frac{d}{dx}x = nx^{n-1} $$ So when you differentiate $\sin x$ you are actually computing $\cos x \cdot x' = \cos x$. For a more detailed answer, let's look at the definition of the derivative.

$$ F'(x) = \lim_{y\rightarrow x}\frac{F(x)-F(y)}{x-y} $$ So let $F(x) = f(g(x))$, and what do we get? $$ F'(x) = \lim_{y\rightarrow x}\frac{f(g(x)) - f(g(y))}{x-y} $$ which we can't evaluate directly. Let us assume that $g(x) \ne g(y)$ whenever $y$ is 'close' to, but not equal to, $x$. Then we can multiply the whole thing by $1 = \frac{g(x)-g(y)}{g(x)-g(y)}$ to get the product of two derivatives: $$ F'(x) = \lim_{y\rightarrow x}\frac{f(g(x)) - f(g(y))}{g(x)-g(y)}\cdot \lim_{y\rightarrow x} \frac{g(x)-g(y)}{x-y} = f'(g(x))g'(x) $$ The first limit equals $f'(g(x))$ because $g$ is differentiable, hence continuous, at $x$, so $g(y) \rightarrow g(x)$ as $y \rightarrow x$. If we want to be picky, the case $g(x)=g(y)$ has to be considered too.
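
As a concrete instance, here is how the same manipulation plays out for the example in the question, $F(x) = \sin 2x$. Take $f(u) = \sin u$ and $g(x) = 2x$ (the labels $f$, $g$ and $u$ are just names chosen for this illustration). Since $g(x) - g(y) = 2(x-y) \ne 0$ whenever $x \ne y$, the assumption above holds automatically, and $$ F'(x) = \lim_{y\rightarrow x}\frac{\sin 2x - \sin 2y}{2x-2y}\cdot \lim_{y\rightarrow x} \frac{2x-2y}{x-y} = \cos 2x \cdot 2 = 2\cos 2x $$ The factor of $2$ is exactly the second limit, i.e. $g'(x)$.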

(What follows is quite informal.) The chain rule actually says something fundamental about composition. We can think of the function $g(x)$ as 'stretching' or 'shrinking' the domain of $f$. When we differentiate $f$ alone, we are differentiating over an 'unstretched' domain, and we must correct for this error by multiplying by the derivative of $g$, which measures how severely the domain was stretched. This is why the power rule, etc. do not seem to use the chain rule: the domain is unstretched, so the derivative doesn't need to be corrected at all!

For your example of $\sin 2x$, let's think about what is going on: we are essentially squeezing the graph of $\sin x$ in the $x$ direction by a factor of 2. This makes the slope of the sine function increase in a predictable way. In fact, the slope at every point of the squeezed graph is twice the slope at the corresponding point of the original sine graph, exactly as predicted by the chain rule!

For more complicated $g(x)$, the amount of stretching varies from point to point, and the chain rule uses $g'(x)$, the rate at which the domain is changing at each $x$, to make the derivative correct.
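
If you would like to see this numerically, the following is a minimal Python sketch, not a proof. It compares a finite-difference estimate of the slope with the chain-rule prediction $f'(g(x))g'(x)$ and with the uncorrected value $f'(g(x))$, for $\sin 2x$ and for $\sin(x^2)$. The helper names `central_difference` and `chain_rule`, the step size `h`, and the sample points are all choices made for this illustration.

```python
import math


def central_difference(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)


def chain_rule(f_prime, g, g_prime, x):
    """Chain-rule prediction f'(g(x)) * g'(x)."""
    return f_prime(g(x)) * g_prime(x)


# Each case pairs an inner function g with its derivative g'.
# The outer function is always sin, whose derivative is cos.
cases = {
    "sin(2x)": (lambda x: 2 * x, lambda x: 2.0),
    "sin(x^2)": (lambda x: x * x, lambda x: 2 * x),
}

for name, (g, g_prime) in cases.items():
    for x in (0.3, 1.0, 2.5):
        numeric = central_difference(lambda t: math.sin(g(t)), x)
        predicted = chain_rule(math.cos, g, g_prime, x)
        uncorrected = math.cos(g(x))  # what you get if you forget g'(x)
        print(
            f"{name} at x={x}: numeric slope={numeric:.6f}, "
            f"chain rule={predicted:.6f}, without g'={uncorrected:.6f}"
        )
```

For $\sin 2x$ the correction factor is the constant $2$, while for $\sin(x^2)$ it is $2x$ and so changes with the point at which you measure the slope.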
