[Math] Multivariable derivative: Limit definition

calculus, multivariable-calculus

Consider a path through a domain in $\mathbb{R}^2$ given by $\mathbf{c}(t) = (x(t), y(t))$. We wish to find the rate of change of a function $f(x,y)$ along this path. Therefore, we wish to compute $\frac{d}{dt}f(\mathbf{c}(t))$. My question is about the limit definition. My book gives the limit definition of the derivative as

$$
\frac{d}{dt}f(\mathbf{c}(t)) = \lim_{h\to 0}\frac{f(x(t+h),y(t+h)) - f(x(t),y(t))}{h}\tag{1}$$

Question

Why isn't the derivative written as:
$$\frac{d}{d\mathbf{c}'(t)}f(\mathbf{c}(t)) = \lim_{h\to 0}\frac{f(x(t+h),y(t+h)) - f(x(t),y(t))}{\sqrt{(x(t+h)-x(t))^2 + (y(t+h) - y(t))^2}}\tag{2}$$
$$\frac{d}{d\mathbf{c}'(t)}f(\mathbf{c}(t)) = \lim_{\Delta x,\Delta y\to 0}\frac{f(x(t) + \Delta x,y(t) + \Delta y) - f(x(t),y(t))}{\sqrt{\Delta x^2 + \Delta y^2}}$$
$$\Delta x = x(t+h) - x(t), \qquad \Delta y = y(t+h) - y(t)$$

where $d/d\mathbf{c}'(t)$ indicates the derivative in the direction of the tangent vector of the path. After reading the comments and responses under this question, my answer to my own question (with their help) is that these two limits speak to different derivatives. The first limit indicates the rate of change of $f$ as the parameter $t$ is changed. It indicates change in $f$ per unit $t$ ($t$ usually stands for time, but as we'll see below, it's just an arbitrary parameter. $t$ doesn't have to be 'time'). This derivative will depend on how you parameterize your path. A path can be parameterized in infinitely many ways ($\mathbf{c}_2(t)$ might move along your path twice as fast as $\mathbf{c}_1(t)$ for instance).

On the other hand, the second limit is simply the derivative of $f(x,y) = f(\mathbf{c}(t))$ with respect to the inside variables, rather than with respect to the 'inside-inside' variable $t$. Remember that if $f(x) = f(g(t))$, $f$ can change either directly through $x$ or indirectly through $t$. Because each case changes the function by turning either the $x$ knob or the $t$ knob, we can ask for either $d/dx$ or $d/dt$. Back to equation $\textbf{(2)}$: since the inside variables $(x,y)$ are confined to change along the path $\mathbf{c}(t)$, I don't write $\partial/\partial x$ or $\partial /\partial y$, but $d/d\mathbf{c}'(t)$. As you take the limit, you see that the derivative is taken in the direction tangent to the path, and $\mathbf{c}'(t)$ is the tangent vector to the path (you can see the direction in which the derivative is taken at a given $t$ by looking at the right-hand side of equation $\textbf{(2)}$). This derivative indicates the change in $f$ per unit distance moved in the tangent direction. Provided $\mathbf{c}'(t) \neq 0$, it depends on the parameterization only through the direction in which the path is traversed, not through the speed.
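To compare the two limits numerically, here is a minimal Python sketch. The field `f`, the two parameterizations `c1` and `c2` of the unit circle, and the helper names are all made up for illustration, and the quotients use a small positive `h` rather than a true limit.

```python
import math

def f(x, y):
    # Hypothetical scalar field, chosen only for illustration.
    return x**2 + 3.0 * y

def c1(t):
    # Unit circle traversed at unit speed.
    return (math.cos(t), math.sin(t))

def c2(t):
    # Same circle, traversed twice as fast.
    return (math.cos(2.0 * t), math.sin(2.0 * t))

def rate_per_parameter(c, t, h=1e-6):
    # Difference quotient of limit (1): change in f per unit of t.
    x0, y0 = c(t)
    x1, y1 = c(t + h)
    return (f(x1, y1) - f(x0, y0)) / h

def rate_per_distance(c, t, h=1e-6):
    # Difference quotient of limit (2): change in f per unit distance moved.
    x0, y0 = c(t)
    x1, y1 = c(t + h)
    return (f(x1, y1) - f(x0, y0)) / math.hypot(x1 - x0, y1 - y0)

# c1(1.0) and c2(0.5) are the same point (cos 1, sin 1) on the circle.
d1_c1 = rate_per_parameter(c1, 1.0)
d1_c2 = rate_per_parameter(c2, 0.5)
d2_c1 = rate_per_distance(c1, 1.0)
d2_c2 = rate_per_distance(c2, 0.5)

print("limit (1):", d1_c1, d1_c2)
print("limit (2):", d2_c1, d2_c2)
```

For this particular example, the quotient for (1) doubles when the circle is traversed twice as fast, while the quotient for (2) is the same for both parameterizations up to the finite-step error.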

Comparisons to the 1-variable case: The first limit (1)

Let's discuss the first limit: the derivative of $f$ with respect to the 'inside-inside' variable, i.e. how $f$ changes indirectly through $t$. Is there a 1D analog? Yes. Consider the composite function $f(x) = f(x(t))$. What's the derivative of $f$ with respect to $t$? We can write the limit definition:

$$\frac{df(x(t))}{dt} = \lim_{h\to\ 0}\frac{f(x(t+h))-f(x(t))}{h}\tag{3}$$
This is indeed the 1D version of the first limit $\textbf{(1)}$ above. To drive the comparison further, we know that $\frac{df}{dt} = f'(x(t))\,x'(t) = (df/dx)(dx/dt)$, i.e. the derivative of the outside times the derivative of the inside. In the multivariate case, the first limit can be shown to equal $\nabla f \cdot \mathbf{c}'(t)$: again the derivative of the outside times the derivative of the inside, summed over the variables. The multivariate chain rule is the generalization of the 1D chain rule. $f(x(t))$ is a composite function, but so too is $f(x(t),y(t)) = f(\mathbf{c}(t))$, just in a multivariate way. $\mathbf{c}(t)$ parameterizes how you move along the $x$ and $y$ axes. In the single-variable case $f(x(t))$, we are parameterizing how we move along the only axis that $f$ has access to, which is the $x$ axis ($x = x(t)$). We usually see composite functions written as $f(g(x)) = f(u)$, in which case we are parameterizing how $f$ moves along the $u$ axis with the parameterization $u = g(x)$ (I'm just using different variables to show that a parameter is a parameter; it doesn't have to be time).

Again, this derivative with respect to the 'inside-inside' variable depends on your parameterization, which we can see clearly in the 1D case (just take simple examples and see how $x'(t)$ changes). In 1D the only path you have to parameterize is the real line (let the path be the whole real line), so different parameterizations differ only in $x'(t)$; that is, they have different 'speeds'. Note that if $x$ is parameterized by $x(t)$, $x'(t)$ is only literally a speed if $x$ has units of meters and $t$ has units of time. By the 'speed' of the parameterization I just mean its derivative: if $f(u) = f(g(k))$, the speed of the parameterization is $du/dk = g'(k)$. The change in $f$ per unit parameter depends on how your path is parameterized. If $f$ is a temperature in Kelvin, $df(x(t))/dt$ could be 2 Kelvin per second for one parameterization and 3 Kelvin per second for another.
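The 2 Kelvin per second versus 3 Kelvin per second example can be reproduced with a small Python sketch. The temperature profile `f` and the two parameterizations `x_slow` and `x_fast` are hypothetical choices, picked so that both pass through the same point $x = 1$.

```python
def f(x):
    # Hypothetical temperature profile in Kelvin; f'(1) = 2.
    return x**2

def x_slow(t):
    # Moves along the x axis at 1 unit of x per second.
    return t

def x_fast(t):
    # Moves along the same axis at 1.5 units of x per second.
    return 1.5 * t

def df_dt(x_of_t, t, h=1e-6):
    # Difference quotient of equation (3): change in f per unit of t.
    return (f(x_of_t(t + h)) - f(x_of_t(t))) / h

# Both parameterizations are evaluated where x = 1.
rate_slow = df_dt(x_slow, 1.0)        # close to f'(1) * 1.0
rate_fast = df_dt(x_fast, 2.0 / 3.0)  # close to f'(1) * 1.5

print(rate_slow, rate_fast)
```

The point on the axis and the function are the same in both calls; only $x'(t)$ differs, and the chain rule $f'(x)\,x'(t)$ accounts for the different rates.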

Comparisons to the 1-variable case: The second limit (2)

Let's look at the 1-variable case $f(u) = f(g(t))$. What is $df/du$? This is just the derivative of the outside function with respect to the inside function (it's how $f$ changes directly through its domain variable as opposed to indirectly through the parameter). Its limit definition is given by

$$\frac{df(u)}{du} = \lim_{k\to 0}\frac{f(g(t) + k) - f(g(t))}{k}$$
which, setting $k = g(t+h) - g(t)$, gives

$$\frac{df(u)}{du} = \lim_{h\to 0}\frac{f(g(t+h)) - f(g(t))}{g(t+h) - g(t)}\tag{4}$$
Either way, hopefully you can get to this line without going through the first. All you're doing is evaluating the function at two different points and dividing by the difference between them. This is the 1D analog of the second limit $\textbf{(2)}$. The main difference is that in 1D I can only move along the path I am parameterizing, which here is the $u$ axis. Therefore I write $d/du$. In the multivariate case, again I can only move along the path. However, as we take small steps $\Delta x$ and $\Delta y$ allowed by the parameterization and let them go to zero (by looking at $\textbf{(2)}$, hopefully you can see that $h \to 0$ implies $\Delta x \to 0$ and $\Delta y \to 0$), we are finding the rate of change of $f$ in the direction tangent to the path (think of two points on your path $\mathbf{c}(t)$; fix one and let the other approach it; the line through the two points becomes the tangent line in the limit). Therefore, I use $d/d\mathbf{c}'(t)$ to denote the direction of the derivative. It is always a direction tangent to the path, and that tangent direction changes along the path. In 1D, the direction tangent to the path is just the path itself (the axis), for the whole path.

The derivatives in $\textbf{(2)}$ and $\textbf{(4)}$ do not depend on the speed of the parameterization. In the 1D case $\textbf{(4)}$, the quotient tends to $f'(u)$ at $u = g(t)$ whatever $g$ is (as long as $g(t+h) \neq g(t)$), and in $\textbf{(2)}$ only the direction of travel along the path matters. Equations $\textbf{(2)}$ and $\textbf{(4)}$ give the rate of change of $f$ along the tangent direction to the path. Equations $\textbf{(1)}$ and $\textbf{(3)}$ give the rate of change of $f$ with respect to the parameter, which does depend on the speed of the parameterization.
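Equations $\textbf{(3)}$ and $\textbf{(4)}$ can be put side by side in a short Python sketch. The function `f` and the two parameterizations `g_slow` and `g_fast` of the $u$ axis are assumptions made for illustration; both are evaluated where $u = 2$.

```python
def f(u):
    # Hypothetical outside function; f'(2) = 12.
    return u**3

def g_slow(t):
    # u = t, so du/dt = 1.
    return t

def g_fast(t):
    # u = 2t, so du/dt = 2.
    return 2.0 * t

def quotient_per_u(g, t, h=1e-6):
    # Difference quotient of equation (4): divide by the change in u.
    return (f(g(t + h)) - f(g(t))) / (g(t + h) - g(t))

def quotient_per_t(g, t, h=1e-6):
    # Difference quotient of equation (3): divide by the change in t.
    return (f(g(t + h)) - f(g(t))) / h

# g_slow(2.0) and g_fast(1.0) both give u = 2.
per_u_slow = quotient_per_u(g_slow, 2.0)
per_u_fast = quotient_per_u(g_fast, 1.0)
per_t_slow = quotient_per_t(g_slow, 2.0)
per_t_fast = quotient_per_t(g_fast, 1.0)

print("equation (4):", per_u_slow, per_u_fast)
print("equation (3):", per_t_slow, per_t_fast)
```

In this example the quotient for (4) lands near $f'(2)$ for both parameterizations, while the quotient for (3) picks up the extra factor $g'(t)$.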

Comments/Complaints on the Directional Derivative

A pure directional derivative is given by equation $\textbf{(2)}$ where $\mathbf{c}(t)$ is a line. That's it, end of story. For a vector $\vec{v} = (v_x, v_y)$, a parameterization along this direction is $x(t) = x + v_xt$ and $y(t) = y + v_yt$. Therefore, the directional derivative, given by equation $\textbf{(2)}$ is:

$$\frac{df(x(t),y(t))}{d\vec{v}} = \lim_{h\to 0}\frac{f(x(t) + v_xh, y(t) + v_yh) - f(x(t), y(t))}{h}$$
$\textbf{IF}$ $\vec{v}$ is a unit vector, so that $\sqrt{v_x^2 + v_y^2} = 1$. However, textbooks will sometimes define this as the directional derivative even if $\vec{v}$ is not a unit vector. I'm not a fan of this because that's not what equation $\textbf{(2)}$ says to do (and I find that $\textbf{(2)}$ makes sense). Essentially what they've done is $\textbf{redefine}$ what it means to be a unit vector: $\vec{v}$, whatever it is, is taken to be the unit of distance. Their derivative is still a rate of change of $f$ per unit distance, but 'per unit distance' has a different meaning than in my definition, which is based on my idea of a unit vector. They do this because they understand what they are doing (redefining what a unit vector means), which is fine, but for first-time learners I think it can be confusing if you don't show that the directional derivative is essentially equation $\textbf{(2)}$.
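The effect of using a non-unit $\vec{v}$ can be checked with a small Python sketch. The field `f`, the base point $(1, 2)$ and the vector $(3, 4)$ are hypothetical choices, and a small positive `h` stands in for the limit.

```python
import math

def f(x, y):
    # Hypothetical scalar field; its gradient at (1, 2) is (2, 3).
    return x**2 + 3.0 * y

def quotient_per_h(x, y, vx, vy, h=1e-6):
    # Textbook-style quotient: divide by h, whatever the length of v is.
    return (f(x + vx * h, y + vy * h) - f(x, y)) / h

def quotient_per_distance(x, y, vx, vy, h=1e-6):
    # Quotient in the style of equation (2): divide by the distance moved.
    step = math.hypot(vx * h, vy * h)
    return (f(x + vx * h, y + vy * h) - f(x, y)) / step

# v = (3, 4) has length 5; (0.6, 0.8) is the unit vector in the same direction.
per_h_nonunit = quotient_per_h(1.0, 2.0, 3.0, 4.0)
per_dist_nonunit = quotient_per_distance(1.0, 2.0, 3.0, 4.0)
per_h_unit = quotient_per_h(1.0, 2.0, 0.6, 0.8)

print(per_h_nonunit, per_dist_nonunit, per_h_unit)
```

Here dividing by `h` with the non-unit vector gives a value $|\vec{v}| = 5$ times larger than dividing by the distance moved, and the two agree once $\vec{v}$ is normalized.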

Best Answer

There is this guy flying over the $(x,y)$-plane, and he is continuously measuring the outside temperature $f(x,y)$. A standard apparatus with a rotating drum, as it is widely used in labs and zoos, would plot the measured temperature against time, i.e., would produce an ink plot of the function $$t\mapsto T(t)=f\bigl({\bf c}(t)\bigr)\ .$$ The derivative of this function is given in your first formula, and is the slope of the ink plot at the point $\bigl(t,T(t)\bigr)$. The resulting slope value depends not only on the point ${\bf c}(t)$ and the temperature conditions there, but also on the instantaneous tachometer speed of the aircraft.

On the other hand, the temperature curve $s\mapsto T(s)$ you are envisaging in your second approach does not depend on the tachometer speed of the aircraft, but only on its compass direction. In other words: It is the curve felt in an aircraft flying with constant speed $1$ along the same route.
