[Math] Gradient, towards a maximum or minimum

multivariable-calculus

I have a question on the gradient of a multivariable function.

For a function $f: \mathbb{R}^{n} \to \mathbb{R}$, the gradient is given as
$$\nabla f = \left[\frac{\partial f}{\partial x_1}, \cdots, \frac{\partial f}{\partial x_n}\right]$$

From the discussion I have read, I agree with the notion that the gradient is exactly the direction of steepest ascent.

To motivate my question, I consider a gradient descent algorithm.
The gradient descent algorithm aims to pick parameters $\vec{\theta} = [\theta_1, \cdots, \theta_n]$ such that the following cost function is minimized.
$$J(\theta) = \sum_{i=1}^{m} (h_{\theta}(x_i) - y_i)^2$$
where $h_{\theta}$ is some function parametrized by $\theta$.

The algorithm works by updating $\theta_{j}$ to $\theta_{j}'$ for all $j$ using the following update rule.
$$ \theta_{j}' = \theta_{j} - \alpha \frac{\partial J(\theta)}{\partial \theta_{j}}$$
for some constant $\alpha$ (the learning rate).
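To make the update rule concrete, here is a minimal sketch of this descent loop in Python, assuming (hypothetically) a linear model $h_{\theta}(x) = \theta \cdot x$ so that the gradient of the squared-error cost has a closed form; the function name and parameters are illustrative, not from the original post.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, iters=1000):
    """Minimize J(theta) = sum_i (h_theta(x_i) - y_i)^2 by repeatedly
    stepping *against* the gradient, assuming h_theta(x) = theta . x."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        residuals = X @ theta - y      # h_theta(x_i) - y_i for every sample i
        grad = 2 * X.T @ residuals     # dJ/dtheta_j, summed over the samples
        theta = theta - alpha * grad   # update rule: theta_j' = theta_j - alpha * dJ/dtheta_j
    return theta
```

For example, fitting the points $(1,2), (2,4), (3,6)$ drives $\theta$ toward $2$, since $y = 2x$ minimizes the cost; the minus sign in the update is exactly what turns "gradient points uphill" into "step downhill".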

By the definition of the gradient, our algorithm gets closer to the minimum with each update. I understand that the gradient is the direction of the greatest rate of change of the function (this follows from the cosine argument in the question linked below), but couldn't that rate of change be either an increase or a decrease? Why does the gradient always point toward a local maximum and not a local minimum?

Why is gradient the direction of steepest ascent?

Best Answer

The gradient points toward the maximum because of its definition. Look at each component $i$ of the gradient: it is the partial derivative $\frac{\partial f}{\partial x_i}$, whose sign indicates the direction in which the function increases along the $x_i$ axis. If $\frac{\partial f}{\partial x_i} > 0$, then $f$ increases as $x_i$ increases, so the component points in the increasing direction; if $\frac{\partial f}{\partial x_i} < 0$, then $f$ increases as $x_i$ decreases, and the negative component again points toward the increase. So every component, and hence the whole vector, points in the direction of increase.