[Math] Why is gradient in the direction of ascent but not descent

intuition multivariable-calculus vector-analysis

I understand that the derivative of a function ($\mathbb{R} \rightarrow \mathbb{R}$) at a point is the rate of change of the output for a slight nudge in the input. This rate of change can be negative or positive. As far as I can see, there is no concept of direction for a single-variable function.

Now, my doubt concerns multivariate functions ($\mathbb{R}^n \rightarrow \mathbb{R}$), where the derivative is the gradient. This gradient, which collects the partial derivatives w.r.t. each basis direction, becomes a direction. This direction is a direction of ascent, not descent. Why? My question is not about steepest ascent, about which one can find many answers on this forum and read elaborately at this link. An intuitive explanation would be preferable to a mathematical one.

Best Answer

The comments persuaded me to reformulate my answer. For the original (still correct, but sub-optimal) version, see below.

The gradient is defined in a completely natural way. There is no purely mathematical reason why it can be said to point in the direction of steepest ascent. It has more to do with some more or less arbitrary choices made in several definitions, which break the symmetry between ascent and descent.

Observation. The concept "gradient points in direction of ascent" also works for single-variable functions $\Bbb R\to\Bbb R$. There is indeed a concept of direction in $\Bbb R$: right and left. A positive derivative is a vector (the gradient) pointing to the right (in the direction of ascent); a negative derivative is a vector pointing to the left (in this case, also the direction of ascent, because the function is decreasing to the right). Since the same observation applies in 1D, we should start looking for an explanation here.
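The 1D observation can be checked numerically. The following is only a minimal sketch: the test function $f(x)=x^2$, the helper names `derivative` and `step_along_derivative`, and the step size `t` are assumptions chosen for illustration, not part of the original answer.

```python
def f(x):
    # Illustrative test function: decreasing for x < 0, increasing for x > 0.
    return x * x


def derivative(g, x, h=1e-6):
    # Central finite-difference approximation of g'(x).
    return (g(x + h) - g(x - h)) / (2 * h)


def step_along_derivative(g, x, t=0.1):
    # Move from x in the direction the derivative points:
    # to the right if g'(x) > 0, to the left if g'(x) < 0.
    return x + t * derivative(g, x)


for x0 in (-1.0, 1.0):
    x1 = step_along_derivative(f, x0)
    direction = "right" if x1 > x0 else "left"
    print(f"x0={x0:+.1f}: derivative points {direction}, "
          f"f goes from {f(x0):.3f} to {f(x1):.3f}")
```

At $x_0=-1$ the derivative is negative and the step goes left; at $x_0=+1$ it is positive and the step goes right. In both cases the function value after the step is larger than before.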

Note: I am going to use the terms "right" and "left" for the directions "positive" and "negative" on the number line, because this is the standard orientation of the number line. This is also a symmetry break, but only a notational one. It does not affect the mathematics in any way if we flip these directions.

Thanks to the comments, a few definitions could be identified as the root cause of the broken symmetry. If you are standing on a mountainside, it is meaningless to ask whether the slope is uphill or downhill. The question only makes sense once you fix a direction with respect to which the slope is judged. The same goes for single-variable functions. It has become standard to call a function increasing if its graph goes uphill to the right. This involves two arbitrary choices:

  1. The $y$-axis points upwards, hence increasing function values are seen as going up. This is the most obvious arbitrary choice. Many applications do it the other way around, e.g. line numbers in text increase from top to bottom, and pixels on a screen are usually addressed with a downwards-increasing $y$-axis.
  2. The kind of slope is judged w.r.t. the "arbitrary" direction "right". Why not left? It seems natural, but it is not forced.

There might be another arbitrary choice: a positive derivative indicates that the function is increasing. We could have defined it the other way around. In any case, flipping any single one of these definitions will change the gradient from pointing upwards to pointing downwards.
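To make the last of these flips concrete, here is a small sketch. The flipped derivative $\tilde f'$ is a hypothetical notation introduced only for this illustration, and $f$ is assumed to be differentiable at $x$. Suppose the derivative had been defined with the opposite sign:

$$\tilde f'(x) := \lim_{h\to 0}\frac{f(x)-f(x+h)}{h} = -f'(x).$$

Taking a small step $t>0$ in the direction this "derivative" points gives, to first order,

$$f\big(x+t\,\tilde f'(x)\big) = f(x) + f'(x)\cdot t\,\tilde f'(x) + o(t) = f(x) - t\,f'(x)^2 + o(t),$$

so the function value decreases whenever $f'(x)\neq 0$. Under this convention the "gradient" would point downhill.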

Note. Yes, I know, "increasing" is formally defined as $x\le y\implies f(x)\le f(y)$, but this definition too is motivated by the picture of a function graph rising to the right. No one would use it if the $y$-axis were pointing downwards.

Conclusion: The reason the gradient points in the direction of steepest ascent lies in our somewhat biased definitions. This is especially evident in the 1D case. The derivative is defined so that it has a positive value (the gradient points to the right) if the function increases. A function is called increasing if its graph goes uphill to the right. You see how these arbitrary definitions combine to "gradient is pointing uphill".
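For the multivariable case in the question, the same bookkeeping can be written out as a first-order expansion. This is a sketch under the standard conventions, assuming $f:\Bbb R^n\to\Bbb R$ is differentiable at $x$ and $t>0$ is a small step size:

$$f\big(x+t\,\nabla f(x)\big) = f(x) + t\,\nabla f(x)\cdot\nabla f(x) + o(t) = f(x) + t\,\|\nabla f(x)\|^2 + o(t).$$

The term $t\,\|\nabla f(x)\|^2$ is never negative, so a small step along the gradient increases $f$ whenever $\nabla f(x)\neq 0$. The sign of that term traces back to the conventions above: each component $\partial f/\partial x_i$ is the limit of $\big(f(x+h e_i)-f(x)\big)/h$, which is positive exactly when $f$ grows in the positive $x_i$-direction.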


ORIGINAL

Because we have a somewhat biased definition of differentiation. Let me explain.

As noted in a comment, this "gradient pointing in direction of ascent" also works for single-variable functions $\Bbb R\to\Bbb R$. There is indeed a concept of direction in $\Bbb R$: left and right. A positive derivative is a vector (the gradient) pointing to the right (in the direction of ascent); a negative derivative is a vector pointing to the left (also in the direction of ascent, because the function is decreasing). Since the same observation holds in $1$D, we should probably start looking for an answer there.

It all happens because the definition of the derivative is biased in some sense. Someone once decided that a function is considered increasing if its value gets bigger to the right. You see the broken symmetry? Why to the right, why not to the left? Then someone decided that the derivative is positive (the gradient points to the right) when the function grows to the right. There we have it. Whoever defined it directly coupled the terms "direction of gradient" and "direction of ascent".

Had they decided to define a function as increasing if its value grows to the left (unnatural, considering our left-to-right reading direction), then the gradient would point in the direction of steepest descent.

Note: In this answer I assumed that the number line is oriented with the positive numbers on the right. This is standard, but it is another symmetry break (only a notational one, without impact on the mathematics). You can substitute negative/positive for left/right everywhere above if you want to be independent of this broken symmetry.
