[Math] In what sense is the derivative the “best” linear approximation

derivatives, numerical methods, real-analysis

I am familiar with the definition of the Fréchet derivative and its uniqueness when it exists. I would, however, like to know in what sense the derivative is the "best" linear approximation. What does this mean formally? "Best" on the entire domain is surely wrong, so it must mean "best" on a small neighborhood of the point at which we are differentiating, where this neighborhood becomes arbitrarily small? Why does the definition of the derivative formalize precisely this? Thank you in advance.

Best Answer

Say $L$ is the tangent line to the graph of $f$ at $a$, so its graph is a straight line through $(a,f(a))$ with $L(a)=f(a)$. Let $L_1$ be another function whose graph is a straight line through $(a,f(a))$, with a different slope. Then there is some open interval $(a-\varepsilon,a+\varepsilon)$ such that for every $x$ in that interval, the value of $L(x)$ is at least as close to the value of $f(x)$ as the value of $L_1(x)$ is.

Now one might take another line $L_2$ through that point, whose slope is closer to that of the tangent line than $L_1$'s is, such that $L_2(x)$ actually comes closer to $f(x)$ than $L(x)$ does for some $x$ in that interval. But then there is a still smaller interval $(a-\varepsilon_2,a+\varepsilon_2)$ within which $L$ beats $L_2$.

For every line except the tangent line, one can make the interval small enough that the tangent line beats the other line within that interval. In general there is no single interval that works no matter how close the rival line gets; rather, one must make the interval small enough in each case separately.
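To make this concrete, here is a sketch of the usual formalization. The notation is mine, not part of the answer above: $L(x)=f(a)+f'(a)(x-a)$ is the tangent line, $L_1(x)=f(a)+m(x-a)$ is a rival line with slope $m\neq f'(a)$, and the factor $\tfrac12$ is an arbitrary choice (any constant strictly below $|f'(a)-m|$ would do).

```latex
\begin{align*}
  &\text{Differentiability at } a \text{ means the tangent's error is sublinear:}\\
  &\qquad f(x) - L(x) = f(x) - f(a) - f'(a)(x-a) = o(|x-a|).\\
  &\text{The rival's error splits as}\\
  &\qquad f(x) - L_1(x) = \bigl(f(x) - L(x)\bigr) + \bigl(f'(a) - m\bigr)(x-a).\\
  &\text{Choose } \varepsilon > 0 \text{ so that } |f(x) - L(x)| \le \tfrac12\,|f'(a) - m|\,|x-a|
    \text{ for } |x-a| < \varepsilon.\\
  &\text{Then, by the triangle inequality, for } 0 < |x-a| < \varepsilon:\\
  &\qquad |f(x) - L_1(x)| \;\ge\; |f'(a) - m|\,|x-a| - |f(x) - L(x)|
    \;\ge\; \tfrac12\,|f'(a) - m|\,|x-a| \;\ge\; |f(x) - L(x)|.
\end{align*}
```

For instance, take $f(x)=x^2$ at $a=1$, so $L(x)=2x-1$ with error $f(x)-L(x)=(x-1)^2$, and the rival $L_1(x)=1.9x-0.9$. Here $\tfrac12|f'(a)-m|=0.05$, and the choice $\varepsilon=0.05$ works: $|f(x)-L_1(x)|=|x-1|\,|x-1+0.1| \ge (x-1)^2$ for $|x-1|<0.05$. As $m$ approaches $f'(a)$, the required $\varepsilon$ shrinks, which is exactly why no single interval works for all rivals.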
