I am currently working through the old textbook Practical Optimization by Gill, Murray and Wright (c. 1982), in which the authors make some derivations that seem correct but that I am unable to duplicate. In the equations below, $g$ and $p$ are both vectors: $g$ is the gradient at the current iterate of an optimization procedure, and $p$ is the search direction we wish to solve for.
The authors inform the reader that the minimizer (over $p$) of
$\Large\frac{g'p}{||p||}$ is $p=-g$.
Alternatively, we might form a different norm for $p$ by considering a symmetric positive definite matrix $C$ in which case the minimizer of
$\Large\frac{g'p}{(p'Cp)^{1/2}}$
is
$p=-C^{-1}g$.
I am having trouble deriving these statements.
My question is: Does it take more than taking a derivative and setting it equal to zero to prove this? If so, how do I derive these formulas?
This question is similar to
Gradient descent with constraints
but not identical: the other question deals with solutions constrained to unit length, not with step directions normalized to unit length.
Best Answer
This is a simple application of Cauchy-Schwarz.
$(g^Tp)^2 \leq \|g\|^2 \|p\|^2.$
So $-\|g\| \leq \frac{g^T p}{\|p\|} \leq \|g\|$, and the lower bound is attained when $p = -g$ (or any positive multiple of it, since the ratio is invariant to positive scaling of $p$).
Similarly, writing $g^T p = \langle C^{-1/2} g,\, C^{1/2} p \rangle$, Cauchy-Schwarz gives $(g^T p)^2 \leq (g^T C^{-1} g)\,(p^T C p)$, so $\frac{g^T p}{(p^T C p)^{1/2}} \geq -(g^T C^{-1} g)^{1/2}$, with equality when $C^{1/2} p$ is a negative multiple of $C^{-1/2} g$, i.e. when $p = -C^{-1} g$ (up to positive scaling).
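If you want to convince yourself numerically, here is a quick NumPy sanity check of both claims (not from the book; the random $g$, the particular SPD construction of $C$, and the tolerances are my own choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
g = rng.standard_normal(n)

# Claim 1: p = -g minimizes g'p / ||p||, with minimum value -||g||.
def ratio(p):
    return g @ p / np.linalg.norm(p)

p_star = -g
assert np.isclose(ratio(p_star), -np.linalg.norm(g))
# No random direction should beat the claimed minimizer.
for _ in range(1000):
    p = rng.standard_normal(n)
    assert ratio(p) >= ratio(p_star) - 1e-12

# Claim 2: p = -C^{-1} g minimizes g'p / sqrt(p'Cp),
# with minimum value -sqrt(g' C^{-1} g).
A = rng.standard_normal((n, n))
C = A @ A.T + n * np.eye(n)  # symmetric positive definite by construction

def ratio_C(p):
    return g @ p / np.sqrt(p @ C @ p)

p_star_C = -np.linalg.solve(C, g)
bound = -np.sqrt(g @ np.linalg.solve(C, g))
assert np.isclose(ratio_C(p_star_C), bound)
for _ in range(1000):
    p = rng.standard_normal(n)
    assert ratio_C(p) >= bound - 1e-12

print("both minimizers verified")
```

Since both ratios are invariant to positive scaling of $p$, the check only needs to sample directions, not magnitudes.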