[Math] Why does the generalised derivative have to be a linear transformation?

derivatives, intuition, motivation, multivariable-calculus, real-analysis

I am starting to learn Real Analysis and I have come across the generalised definition of the derivative for higher dimensions. I realise that the derivative being a linear transformation nicely accommodates the one dimensional case where the derivative is just a constant at any point. I also understand it can't be as simple as multiplication by a constant for higher dimensions since you can approach a point along multiple curves in higher dimensions. But where did we hit upon the fact that it has to be linear? Why couldn't it be some other type of function? I would like to get an intuitive explanation.

Best Answer

It does not have to be - we want it to be so.

The derivative is a linear map by definition. So the real question is: "Why is the notion of linear approximation so interesting that it deserves such a central place?" The answer is that linear maps are fairly simple to understand while still being fairly general.
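For concreteness, the definition being discussed is usually stated as follows (standard notation, not taken from the question: $f:\mathbb{R}^n\to\mathbb{R}^m$, a point $a$, and a candidate linear map $L$). The function $f$ is differentiable at $a$ if there is a linear map $L:\mathbb{R}^n\to\mathbb{R}^m$ with

$$\lim_{h\to 0}\frac{\|f(a+h)-f(a)-Lh\|}{\|h\|}=0,$$

i.e. $f(a+h)=f(a)+Lh+o(\|h\|)$. Replacing "linear map" by another class of maps in this sentence gives the alternative notions considered next.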

If you choose simpler approximations, e.g. you only allow maps of the form $x\mapsto \lambda x$ for some scalar $\lambda$ as "derivatives", many functions would no longer be "differentiable".
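A small worked example of this, assuming the same "error is $o(\|h\|)$" requirement as in the usual definition: take the rotation $f(x,y)=(-y,x)$ on $\mathbb{R}^2$ and try to approximate it at $0$ by $h\mapsto\lambda h$. For $h=(h_1,h_2)$,

$$\|f(h)-\lambda h\|^2=(h_2+\lambda h_1)^2+(h_1-\lambda h_2)^2=(1+\lambda^2)\|h\|^2,$$

so the error is never small compared to $\|h\|$, whatever $\lambda$ is. A map that is itself linear would fail to be "differentiable" under the restricted notion.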

If you choose more complicated maps, e.g. you allow maps like $x\mapsto Ax + B|x|$ with a componentwise absolute value (so the "derivative" would be a pair $(A,B)$), a few more functions become "differentiable", but it is far from clear how this notion would be of any help.

So linear maps seem to strike a perfect balance between simplicity and generality. You can see this in action in, e.g., Newton's method in higher dimensions, or in the analysis of non-linear systems of differential equations by means of their local linearizations.
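As an illustration of why a linear approximation is so convenient, here is a minimal sketch of Newton's method in two dimensions. The system `F`, its Jacobian `J`, the helper `newton_2d` and the starting point are all hypothetical choices made for this example, not something from the answer. The point is that each step only requires solving a *linear* system, which is easy.

```python
import math

def F(x, y):
    """Hypothetical example system: unit circle intersected with the line x = y."""
    return (x * x + y * y - 1.0, x - y)

def J(x, y):
    """Jacobian of F at (x, y): the linear map approximating F near that point."""
    return ((2.0 * x, 2.0 * y),
            (1.0, -1.0))

def newton_2d(x, y, tol=1e-12, max_iter=50):
    """Newton's method: repeatedly solve the linearized system J * step = -F."""
    for _ in range(max_iter):
        f1, f2 = F(x, y)
        if math.hypot(f1, f2) < tol:
            break
        (a, b), (c, d) = J(x, y)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("Jacobian is singular at this iterate")
        # Cramer's rule for the 2x2 linear system J * (dx, dy) = (-f1, -f2)
        dx = (-f1 * d + b * f2) / det
        dy = (-a * f2 + c * f1) / det
        x, y = x + dx, y + dy
    return x, y

# Starting guess (1.0, 0.5) is an arbitrary choice for the illustration.
root = newton_2d(1.0, 0.5)
print(root)
```

From this starting point the iterates approach $(1/\sqrt{2}, 1/\sqrt{2})$. If the "derivative" were allowed to be a non-linear object such as the pair $(A,B)$ above, the step equation would no longer be a linear system, and it is not obvious how to solve it in general.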

(Another aspect: For functions of a complex variable there are two notions of differentiability. You can consider real linearity, which gives differentiability in the sense of mappings from two-dimensional real space to itself. The other possibility is to consider complex linearity, and this leads to holomorphic functions. This gives a lot of extra rigidity and leads to a more restrictive but also more powerful notion of derivative.)
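To make the difference concrete (a sketch in standard notation; the symbols $u,v,\alpha,\beta$ are introduced here): write $f=u+iv$ and identify $\mathbb{C}$ with $\mathbb{R}^2$. A real-linear map is any $2\times 2$ real matrix, while multiplication by a complex number $\alpha+i\beta$ corresponds to a matrix of the special form

$$\begin{pmatrix}\alpha & -\beta\\ \beta & \alpha\end{pmatrix}.$$

Requiring the Jacobian of $(u,v)$ to have this form gives the Cauchy-Riemann equations $u_x=v_y$, $u_y=-v_x$. For example, complex conjugation $z\mapsto\bar z$ has Jacobian $\begin{pmatrix}1&0\\0&-1\end{pmatrix}$, so it is differentiable in the real sense but not holomorphic.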
