[Math] Understanding higher dimensional derivatives

derivatives real-analysis

I'm having trouble understanding higher dimensional derivatives.

Suppose $f: \Bbb R \to \Bbb R$. We say $f$ is differentiable at $x = c$ if $\lim \limits_{x \to c} \dfrac{f(x) - f(c)}{x - c}$ exists. If the limit exists, we define $f'(c)$ as the value of the limit. Since this limit is a number, we can equivalently define the derivative of $f$ at $c$ as the number $q$ such that $\lim \limits_{x \to c} \dfrac{f(x) - f(c) - q(x - c)}{x - c} = 0$. If such a $q$ exists, we define $f'(c)$ to be this $q$. We can think of $q$ as a "linear transformation" from $\Bbb R$ to $\Bbb R$. Why is it useful to think of $q$ in this way? What's the point of thinking about it as a linear transformation? I think it probably has something to do with the fact that $f(c) + q(x - c)$ is the tangent line.

Now, we can define the derivative of $g: \Bbb R^{n} \to \Bbb R^{m}$ in the same way: if there is some linear transformation $T_{g}$ such that $\lim \limits_{x \to c} \dfrac{||g(x) - g(c) - T_{g}(x - c) ||}{||x - c||} = 0$, then we say $T_{g}$ is the derivative of $g$ at $c$. Again, what's the point of saying this is a linear transformation? I already know it has to be an $m \times n$ matrix based on the context, but I don't see why we care that it is a linear transformation. Does it have something to do with the tangent plane?

Best Answer

If we start out with $f:\mathbb{R}\to\mathbb{R}$, then $f'(c)$ tells us approximately how $f$ changes in a small interval around $x=c$: for $x$ near $c$, $f(x)-f(c)\approx f'(c)(x-c)$. For example, let $f(x)=x^3$ and $c=2$. Then $f'(2)=12$. Notice that $f(2.01)=8.120601$, so the change from $f(2)$ to $f(2.01)$ is $0.120601$. This is approximately $12(0.01)=0.12$, that is, the linear map $h\mapsto 12h$ applied to the step $h=0.01$.
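To see the one-dimensional approximation numerically, here is a minimal Python sketch. It assumes the same $f(x)=x^3$ and $c=2$ as the example; the helper names `f_prime` and `relative_error` are made up for illustration. It prints the actual change, the linear estimate $f'(c)\,h$, and the error divided by $h$, which is the quantity the limit definition requires to go to $0$.

```python
# Compare the true change in f(x) = x**3 near c = 2
# with the linear estimate f'(c) * h.

def f(x):
    return x ** 3

def f_prime(x):
    # Derivative of x**3, worked out by hand.
    return 3 * x ** 2

def relative_error(c, h):
    # (f(c + h) - f(c) - f'(c) * h) / h, the quotient from the
    # "number q" form of the definition, with q = f'(c).
    return (f(c + h) - f(c) - f_prime(c) * h) / h

c = 2.0
for h in (0.1, 0.01, 0.001):
    actual_change = f(c + h) - f(c)
    linear_estimate = f_prime(c) * h
    print(f"h={h}: actual={actual_change:.6f}, "
          f"linear={linear_estimate:.6f}, "
          f"error/h={relative_error(c, h):.6f}")
```

For this particular $f$ the error is $6h^2+h^3$, so the last column shrinks roughly in proportion to $h$.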

For higher dimensions, the derivative needs to be a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$. For example, take $f(x,y)=(x+y,y^2)$. Then the derivative at $(x,y)$ is the matrix of partial derivatives $$ \begin{pmatrix} 1 & 1 \\ 0 & 2y \end{pmatrix}. $$ At $(x,y)=(2,1)$ this is $$ \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}. $$ Moving on, $f(2,1)=(3,1)$ and $f(2.01,1.01)=(3.02,1.0201)$. The change between the function values is $(0.02,0.0201)$. Notice that $$ \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} 0.01 \\ 0.01 \end{pmatrix} = \begin{pmatrix} 0.02 \\ 0.02 \end{pmatrix}. $$ Again, very close. So, the derivative at a point is the linear transformation that most closely fits the change in the function near that point. Since linear transformations are much easier to study than functions in general, we may learn a lot about the function from its derivatives.
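The same check can be run against the limit in the question's definition. The sketch below is only an illustration: it assumes the example $f(x,y)=(x+y,y^2)$ and its derivative matrix as computed by hand, and the helpers `jacobian`, `apply_matrix` and `remainder_ratio` are hypothetical names, not from any library. It evaluates $\dfrac{\|f(c+h)-f(c)-T(h)\|}{\|h\|}$ at $c=(2,1)$ for shrinking steps $h=(t,t)$.

```python
import math

def f(x, y):
    return (x + y, y ** 2)

def jacobian(x, y):
    # Matrix of partial derivatives of f, worked out by hand.
    return ((1.0, 1.0),
            (0.0, 2.0 * y))

def apply_matrix(matrix, v):
    # Plain matrix-vector product, so no external library is needed.
    return tuple(sum(a * b for a, b in zip(row, v)) for row in matrix)

def remainder_ratio(c, h):
    # ||f(c + h) - f(c) - T(h)|| / ||h||, with T the Jacobian at c.
    fx = f(c[0] + h[0], c[1] + h[1])
    fc = f(*c)
    linear_part = apply_matrix(jacobian(*c), h)
    remainder = [fx[i] - fc[i] - linear_part[i] for i in range(2)]
    return math.hypot(*remainder) / math.hypot(*h)

c = (2.0, 1.0)
ratios = [remainder_ratio(c, (t, t)) for t in (0.1, 0.01, 0.001)]
for t, ratio in zip((0.1, 0.01, 0.001), ratios):
    print(f"t={t}: ratio={ratio:.6f}")
```

Along this direction the remainder is exactly $(0,t^2)$, so the ratio is $t/\sqrt{2}$ and tends to $0$ as $t\to 0$. This only samples one direction; the definition requires the limit to hold however $x$ approaches $c$.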
