[Math] Is the gradient really the direction of steepest ascent?

multivariable-calculus

I want to intuitively understand why the gradient gives you the direction of the steepest ascent of a function.

Apart from the already posted questions, my confusion arises from the fact that we form the gradient vector from the derivative along each dimension separately. We then take the vector consisting of both derivatives (in 2D) as the direction of steepest ascent.

What if the derivative is, say, $5$ in both directions, so our vector is at $45$ degrees to both axes, but in that specific direction the function goes down?

If it's not clear what I'm confused about, consider this function represented as an image:

$$ \begin{pmatrix}100&5&-100\\0&\textit{0}&5\\0& 0& 0\end{pmatrix}$$

At the italicized $0$, it makes sense that the derivative is $5$ in $x$ and in $y$, but the vector $(5,5)$ points in a direction that is not the steepest ascent. Does this have to do with the differentiability of the function? What am I missing?

Best Answer

What will help your intuition the most is remembering that the derivative (the gradient) is a local feature: it depends only on how the function behaves in an arbitrarily small neighborhood of the point, not on its values any fixed distance away.

You may be visualizing a function that bends downward in the gradient direction, so that direction is no longer the steepest ascent some distance away. But at the point where you find the tangent plane, it is the direction of steepest ascent for at least a very small distance.
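
To make the "very small distance" point concrete, here is a minimal numerical sketch. The function `f` below is a hypothetical example (not taken from the question), chosen only because its gradient at the origin is $(5, 5)$ while it drops off further out along that direction. The helper names `best_direction_deg` and `along_gradient` are likewise made up for illustration.

```python
import math


def f(x, y):
    # Hypothetical example: partial derivatives at (0, 0) are both 5,
    # but the -100*x*y term pulls the surface down away from the origin.
    return 5 * x + 5 * y - 100 * x * y


def best_direction_deg(radius, steps=360):
    """Angle (degrees) of the sampled direction with the largest f
    at the given distance from the origin."""
    def value_at(k):
        theta = 2 * math.pi * k / steps
        return f(radius * math.cos(theta), radius * math.sin(theta))

    best_k = max(range(steps), key=value_at)
    return 360.0 * best_k / steps


def along_gradient(t):
    """Value of f after moving a distance t from the origin along the
    unit gradient direction (1, 1) / sqrt(2)."""
    u = 1 / math.sqrt(2)
    return f(t * u, t * u)


if __name__ == "__main__":
    # Very small step: the best sampled direction is the gradient direction.
    print(best_direction_deg(1e-3))
    # Tiny step along the gradient goes up; a step of length 1 goes down.
    print(along_gradient(1e-3), along_gradient(1.0))
```

For a tiny radius the best sampled direction sits at 45 degrees, matching the gradient, and the function increases along it. For a step of length 1 along the same direction the value is negative, which is the situation the image in the question depicts: the pixel spacing is too coarse to show the local behaviour.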

At a point where a function is differentiable, the function is almost planar in a very, very small region around that point. Remember to visualize the local region as nearly a plane, and your intuition will be happier with the gradient.
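
The "almost planar" picture can be written out as a short derivation. This is a sketch, assuming $f$ is differentiable at the point $p$; the symbols $p$, $u$ (a unit vector), $t$ (a small step length) and $\theta$ (the angle between $u$ and $\nabla f(p)$) are introduced here for illustration:

$$ f(p + t\,u) = f(p) + t\,\nabla f(p)\cdot u + o(t) \quad\Longrightarrow\quad D_u f(p) = \nabla f(p)\cdot u = \|\nabla f(p)\|\cos\theta $$

The rate of increase $D_u f(p)$ is largest when $\cos\theta = 1$, that is, when $u$ points along $\nabla f(p)$. With both partial derivatives equal to $5$, as in the question, a unit vector $u = (\cos\alpha, \sin\alpha)$ gives

$$ D_u f = 5\cos\alpha + 5\sin\alpha = 5\sqrt{2}\,\cos\!\left(\alpha - 45^\circ\right), $$

which peaks at $\alpha = 45^\circ$ with value $5\sqrt{2} \approx 7.07$, larger than the slope of $5$ along either axis. The $o(t)$ term is what can make the function turn downward farther from $p$, but it is negligible for small enough $t$.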
