[Math] Difference between Gradient Descent method and Steepest Descent

convex optimizationgradient descentnumerical optimizationoptimizationvector analysis

What is the difference between Gradient Descent method and Steepest Descent methods?

In this book, they have come under different sections:

http://stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf

According to page 480, Gradient Descent is:

$$\Delta x=-\nabla f(x)$$

While page 490, for Steepest descent says:

$$\Delta x_{sd}=||\nabla f(x)||_*\Delta x_{nsd}$$
$$\Delta x_{nsd} = \text{argmin}\{\nabla f(x)^Tv~|~~~ ||v||\leq 1\}$$

I cannot understand their difference. How they are mathematically and geometrically different?

Best Answer

I am reading this book too, this is also a problem for me for a long time. The direction of gradient descent method is negative gradient. However the direction of steepest descent method is the direction such that

$Δx_{\text{nsd}}=\text{argmin}\{∇f(x)^Tv \quad| \quad ||v||≤1\}$

which is negative gradient only if the norm is euclidean. If the norm is other quadratic or l1norm, the result are not negative gradient.

The direction is -inv(P)*∇f(x), if the norm is quadratic norm.

Related Question