Calculus – Why Any Directional Derivative is Recoverable from the Gradient

calculus, multivariable-calculus, partial-derivative

I understand what partial derivatives, directional derivatives, and the gradient are. I can even follow symbolically from their definitions why:

$$D_\mathbf{u} f = \nabla f \cdot \mathbf{u}$$

But nonetheless I find it surprising that knowing the derivative in just 3 directions at a point (the gradient) is sufficient to figure out what it must be in any direction. The equation says that the derivative in an arbitrary direction must be a weighted combination (dot product) of the gradient's components, with the weights determined by how x-axis'y and how y-axis'y the direction $\mathbf{u}$ is. What prevents me from constructing a function where this isn't true? Why can't a simultaneous increase in x and y give a dramatically different result than either alone? (e.g. a function that rises in the x+ direction and the y+ direction, but falls dramatically along the diagonal?)

Best Answer

Why can't a simultaneous increase in x and y give a dramatically different result than either alone? (e.g. a function that rises in the x+ direction and the y+ direction, but falls dramatically along the diagonal?)

Well, it can, but then the function won't be differentiable. One concrete example of a function that behaves differently along the $x$ and $y$ axes than it does in between is the function $z = r\sin(2 \theta)$, in cylindrical coordinates. This function is not differentiable at the origin. It is continuous at the origin and has slopes of $0$ in the $x$ and $y$ directions there - the $x$ and $y$ axes are both contained in the graph of the function. But in the direction at angle $\theta$ the slope at the origin is $\sin(2\theta)$, which takes every value between $-1$ and $1$.
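To see this numerically, here is a minimal Python sketch. It assumes the Cartesian form $z = 2xy/\sqrt{x^2+y^2}$ of $r\sin(2\theta)$ (with the value $0$ at the origin), and the helper name `directional_slope` and the step size `h` are my own illustrative choices. It compares the actual slope at the origin in a few directions with what $\nabla f \cdot \mathbf{u}$ would predict from the two axis slopes.

```python
import math

def f(x, y):
    """z = r*sin(2*theta) in Cartesian form: 2xy / sqrt(x^2 + y^2), with f(0, 0) = 0."""
    r = math.hypot(x, y)
    return 0.0 if r == 0.0 else 2.0 * x * y / r

def directional_slope(func, theta, h=1e-6):
    """One-sided difference quotient at the origin in the direction (cos theta, sin theta)."""
    ux, uy = math.cos(theta), math.sin(theta)
    return (func(h * ux, h * uy) - func(0.0, 0.0)) / h

# The partial derivatives at the origin are the slopes along the x and y axes.
fx = directional_slope(f, 0.0)
fy = directional_slope(f, math.pi / 2)

for degrees in (0, 45, 90, 135):
    theta = math.radians(degrees)
    actual = directional_slope(f, theta)
    predicted = fx * math.cos(theta) + fy * math.sin(theta)  # gradient dot u
    print(f"{degrees:3d} deg: actual {actual:+.3f}, gradient formula {predicted:+.3f}")
```

Both axis slopes come out as (essentially) zero, so the gradient formula predicts a slope of about zero in every direction, while the actual slope along the 45 degree diagonal is close to $1$ and along 135 degrees close to $-1$. The mismatch is exactly the failure of differentiability at the origin.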

Remember that a point and two slopes in non-parallel directions are all that we need to completely determine a plane. So, if the tangent plane to the graph of $f(x,y)$ is well defined at a point, the slopes of the tangent plane in the $x^+$ and $y^+$ directions completely characterize the plane. A plane, being flat, can't increase along the $x^+$ and $y^+$ axes and decrease in between.
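The plane argument can be written out in a couple of lines. As a sketch, suppose the tangent plane at $(a,b)$ has slopes $m_x$ and $m_y$ in the $x^+$ and $y^+$ directions (these symbols are my own notation, not from the question):

$$z = f(a,b) + m_x\,(x-a) + m_y\,(y-b)$$

Moving a distance $t$ from $(a,b)$ in the unit direction $\mathbf{u} = (u_1, u_2)$ changes the height of the plane by $m_x t u_1 + m_y t u_2$, so the slope of the plane in that direction is

$$m_x u_1 + m_y u_2 = (m_x, m_y) \cdot \mathbf{u}$$

In particular, if $m_x > 0$ and $m_y > 0$, then along the diagonal $\mathbf{u} = (1/\sqrt{2}, 1/\sqrt{2})$ the slope is $(m_x + m_y)/\sqrt{2} > 0$, so the plane has to rise there too.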

If a function tried to do that, it would not be differentiable at the point in question - it would not be well approximated by the plane that the gradient determines. This is the source of the definition of differentiability: a differentiable function has its slope in each direction determined by that direction and the slopes in the $x^+$ and $y^+$ directions.

The same thing happens in one dimension; we are just too used to it to notice. You might ask, "Why does the behavior of a function in the $x^+$ direction determine the behavior in the $x^-$ direction? Why can't a function rise in both the $x^+$ and $x^-$ directions?" Of course, a function can do that, as $y = |x|$ does. But then the function will not be differentiable at the point in question, because it will not be well approximated by the line that is determined by the rate of change in the $x^+$ direction.
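The one-dimensional version is easy to check numerically. In this small sketch (the helper name `directional_slopes_1d`, the step size, and the comparison function $3x + x^2$ are illustrative choices of mine), the rate of change is measured when stepping in the $x^+$ and in the $x^-$ direction from $0$. For a differentiable function the two rates should be (nearly) negatives of each other, since a line that rises one way must fall the other way.

```python
def directional_slopes_1d(func, x0=0.0, h=1e-6):
    """Rates of change when stepping a distance h in the +x and -x directions."""
    plus = (func(x0 + h) - func(x0)) / h
    minus = (func(x0 - h) - func(x0)) / h
    return plus, minus

# |x| rises in both directions, so no single line can match it at 0.
abs_plus, abs_minus = directional_slopes_1d(abs)

# A differentiable comparison function (hypothetical example): 3x + x^2.
smooth_plus, smooth_minus = directional_slopes_1d(lambda x: 3.0 * x + x * x)

print("abs:   ", abs_plus, abs_minus)
print("smooth:", smooth_plus, smooth_minus)
```

For `abs` both rates are positive, while for the smooth function the $x^-$ rate is approximately the negative of the $x^+$ rate, which is what the line through the point with that slope predicts.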

The situation in two or more variables is no different. In one dimension, the slope in the $x^+$ direction determines a line. In two dimensions, the slopes in the $x^+$ and $y^+$ directions determine a plane. In either case, we define the function to be differentiable if, around the point we started with, the function is well approximated by that line or plane in every direction that we can go, given the number of dimensions we are working with.
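For reference, "well approximated in every direction" can be stated precisely. One standard formulation (the notation $\mathbf{p}$ for the point and $\mathbf{h}$ for the displacement is mine) says that $f$ is differentiable at $\mathbf{p}$, with gradient $\nabla f(\mathbf{p})$, when

$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{\left| f(\mathbf{p} + \mathbf{h}) - f(\mathbf{p}) - \nabla f(\mathbf{p}) \cdot \mathbf{h} \right|}{\|\mathbf{h}\|} = 0$$

Taking $\mathbf{h} = t\mathbf{u}$ for a unit vector $\mathbf{u}$ and letting $t \to 0^+$ gives

$$\left| \frac{f(\mathbf{p} + t\mathbf{u}) - f(\mathbf{p})}{t} - \nabla f(\mathbf{p}) \cdot \mathbf{u} \right| \to 0$$

which is exactly $D_\mathbf{u} f = \nabla f \cdot \mathbf{u}$. So the formula is not an extra fact about differentiable functions; it is built into what differentiability means.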