Multivariable Calculus – Derivative as a Linear Transformation

derivatives, linear-transformations, multivariable-calculus

I can't understand this:

Consider $f : A \rightarrow Y$ for a normed space $Y$, where $A$ is an open subset of $\mathbb R$ and $a \in A$.

In this case, the existence of the vector derivative $f'(a)$ is equivalent to the fact that, in a neighbourhood of $a$, $f(a+r)$ can be very well estimated by $f(a)+[f'(a)](r)$.

The meaning of 'very well estimated' is given precisely as:

for each $\epsilon > 0$, there is a $\delta > 0$ such that if $|r| < \delta$, then $\|f(a+r)-f(a)-[f'(a)](r)\| \leq \epsilon |r|$.

The notation $[f'(a)]$ is meant to be suggestive. Any vector $y \in Y$ corresponds to the linear transformation $T_{y}:\mathbb R \rightarrow Y$ given by $T_{y}(r)=ry$. Conversely, any linear transformation $T$ in $L(\mathbb R,Y)$ corresponds to the vector $y=T(1)$.

It turns out that thinking of $[f'(a)]$ as a linear transformation in $L(\mathbb R,Y)$, rather than as a vector, is the key that lets us generalize the definition to cases where the domain space $X$ is not $\mathbb R$.

What I can't understand is why we think of the derivative as a linear transformation here. Please help.


EDIT: In particular, I can't understand why the book states:

Any vector $y \in Y$ corresponds to a linear transformation $T_{y}:\mathbb R \rightarrow Y$ given by $T_{y}(r)=ry$. Conversely, any linear transformation $T$ in $L(\mathbb R,Y)$ corresponds to the vector $y=T(1)$. It turns out that thinking of $[f'(a)]$ as a linear transformation in $L(\mathbb R,Y)$, rather than as a vector, is the key that lets us generalize the definition to cases where the domain space $X$ is not $\mathbb R$.


Please, could anyone explain this to me? When constructing the answer, keep in mind that I only know the fundamentals of calculus and analysis and don't know much about vector spaces.

Best Answer

I think it may be easier for you to first think about the one-dimensional case; so take a differentiable function $f \colon \mathbb{R} \to \mathbb{R}$ and assume that you want to differentiate it at the point $0$. Let us also assume that $f(0) = 0$; if this is not true, you can translate your axes so that it becomes true.

You probably already know that $f'(0)$ is the slope of the straight line tangent to the graph of $f$ and passing through $(0, 0)$:

[Figure: derivative of a function at the origin]

(the solid line is the function $f$, the dashed line is the tangent line at the point $(0, 0)$)

This is a perfectly legitimate way to think about a derivative. But there is another one, which is helpful for understanding differentiation in more complex cases.

Instead of thinking of the derivative of a function at a point as a number, we want to think of it as another function (note that when I say "function" I don't mean the derivative function $f'$ as a whole; I'm speaking of the derivative at a single point as a function in its own right). In particular, I have the function $f$ near the point $0$ and I want to find another function which is linear and is the "best possible approximation" of $f$ near $0$. Which function is this? It is the one that maps $$ x \mapsto f'(0) \cdot x . $$

This whole function is what your book calls $[f'(a)]$ (here with $a = 0$). Let me stress again that this is not just a number; it is a whole function. It is the dashed line in the picture above. So, while $f'(0)$ is just the slope of the dashed line, $[f'(0)]$ is the dashed line itself (or, more precisely, the function whose graph is the dashed line).
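For concreteness (this worked example is mine, not your book's), take $f(x) = x^2$ and the point $a = 1$, so $f'(1) = 2$ and $[f'(1)]$ is the map $r \mapsto 2r$. Then
$$ f(1+r) - f(1) - [f'(1)](r) = (1+r)^2 - 1 - 2r = r^2 \leq \epsilon |r| \quad \text{whenever } |r| \leq \epsilon, $$
so choosing $\delta = \epsilon$ satisfies your book's definition of "very well estimated". Here the number $f'(1) = 2$ is the slope of the tangent line, while $[f'(1)]$ is the whole linear map $r \mapsto 2r$.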

The whole point of differentiation is this: taking functions that are not necessarily linear and "making them linear", with respect to one point of the domain. Why do we take derivatives? Because linear functions are much more beautiful and their behavior is much easier to understand (of course, if the initial function $f$ is already linear, then we do not have to do much: its derivative, taken at any point, will be the function $f$ itself).

Now, this was the easy case: a function from $\mathbb{R}$ to $\mathbb{R}$. The point is that the same reasoning works in more complex situations. For example, if you have a nonlinear function $f \colon \mathbb{R}^n \to \mathbb{R}^m$, you can perfectly well define its derivative at a certain point $a$ as the function $\mathbb{R}^n \to \mathbb{R}^m$ that is linear and best approximates $f$ near $a$. Of course, you still have to understand whether it exists and is unique and, even then, you may want to know how to make computations with it, but at least for the definition this is the simplest way to think about a derivative.
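If it helps to see this numerically, here is a minimal sketch (my own, not part of the original answer) using NumPy. The map $f \colon \mathbb{R}^2 \to \mathbb{R}^2$, the point $a$, and the displacement direction are arbitrary choices for illustration; the matrix $J$ of the linear map $[f'(a)]$ is the Jacobian, and the error $\|f(a+r) - f(a) - Jr\|$ shrinks faster than $\|r\|$, which is exactly the "best linear approximation" property.

```python
import numpy as np

# A nonlinear map f : R^2 -> R^2 (chosen only for illustration).
def f(v):
    x, y = v
    return np.array([x * y, np.sin(x) + y**2])

# The Jacobian of f at a point: the matrix of the linear map [f'(a)].
def jacobian(v):
    x, y = v
    return np.array([[y,         x],
                     [np.cos(x), 2 * y]])

a = np.array([1.0, 0.5])
J = jacobian(a)

# The ratio error / |r| should go to 0 as |r| goes to 0:
# this is the precise meaning of "very well estimated".
for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    r = t * np.array([0.3, -0.7])          # a small displacement
    error = np.linalg.norm(f(a + r) - f(a) - J @ r)
    print(f"|r| = {np.linalg.norm(r):.1e},  error/|r| = {error / np.linalg.norm(r):.2e}")
```

Running it, the printed ratio drops roughly by a factor of ten each time $|r|$ does, so the linear map really is a better and better approximation the closer you stay to $a$.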

About your question on the correspondence between vectors and linear functions: if you are still not confident enough with linear algebra, do as before and think in just one dimension. This statement in one dimension reads:

There is a correspondence between real numbers and linear functions $\mathbb{R} \to \mathbb{R}$.

What is the correspondence? If you have a real number $m$, you can make a function out of it just by taking the linear function $x \mapsto mx$. If you have a linear function, you just take its slope and you have found a number. These two constructions are inverse to each other.
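If you like, here is that correspondence as a tiny Python sketch (my own illustration; the helper names are hypothetical):

```python
# A number m gives the linear map x -> m*x; a linear map T gives back the number T(1).
def number_to_map(m):
    return lambda x: m * x

def map_to_number(T):
    return T(1)

m = 3.0
T = number_to_map(m)

print(map_to_number(T))    # 3.0 -- recovering the slope from the map
print(T(2.0), m * 2.0)     # 6.0 6.0 -- the map is just multiplication by m
```

Going number → map → number (or map → number → map) brings you back to where you started, which is all the book means by "corresponds".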

This is the same thing that we did above with differentiation: if you have a function and you know its derivative as a number, you can easily construct its derivative as a function in the way I suggested above. If you have the function, then the number is nothing other than its slope.

When you know more about vector spaces (which I suggest you learn soon), you will see that everything I just wrote works just the same in that setting.

Why do we think of derivatives this way?

I would mention a couple of reasons.

The first one is practical: numbers are easy to work with as long as you have just one dimension. They become a bit more complicated in more dimensions, but still manageable. In more complicated contexts, though, they would become a real nightmare: on Riemannian manifolds you wouldn't know how to choose them; in infinite dimensions you would have infinitely many of them (and this would be the least of the technical complications). Linear maps, instead, are always pretty easy to define and usually behave the right way. In those contexts it is much better to work with them.

The other reason is more theoretical, and I already sketched it above. The point is: "Why do we take derivatives? What are they good for?" My view is that derivatives are mostly a way to make difficult things easier. Arbitrary smooth functions can be very complicated. Linear functions, instead, are very easy to understand: you can compose them, you can describe them easily, you know their behavior. So having a "magic wand" that takes a smooth function and turns it into a linear function that locally shares some of the features of the original function is really desirable (with reference to the picture: the wand takes the solid line and turns it into the dashed line). Differentiation is this magic wand.
