Find a descent direction $d^k$ orthogonal to direction $d^{k-1}$ which is linear comb. of $d^{k-1}$ and $\nabla f(x_k)$

linear algebramaxima-minimamultivariable-calculusoptimization

In the process of minimizing a function $f:\mathbb{R}^n\to\mathbb{R},
f\in C^1$, the iteration $x^k$ was obtained by a linear search along
the direction $d^{k-1}$. Find a direction $d^k$ orthogonal to
$d^{k-1}$, which is a descent direction and that is a linear
combination of $d^{k-1}$ and $\nabla f(x_k)$

These are our conditions:

$$d^k = ad^{k-1} + b\nabla f(x_k)\\d^k\cdot d^{k-1} = 0\\d^k\cdot\nabla f(x_k)<0$$

so:

$$d^{k-1}(ad^{k-1} + b\nabla f(x_k)) = 0\implies a||d^{k-1}||^2 + bd^{k-1}\nabla f(x_k) = 0$$

$$a = -\frac{1}{||d^{k-1}||^2}bd^{k-1}\nabla f(x_k)$$

if we take $a$ like above, and any $b$, we have that the new direction is orthogonal to $d^{k-1}$. But we also need $d^k\nabla f(x_k)<0$ so let's enforce this on $d^k$ with our new $a$:

$$\nabla f(x_k)\cdot d^k = \left(\left( -\frac{1}{||d^{k-1}||^2}b||d^{k-1}||^2||\nabla f(x_k)||^2\right) + b||\nabla f(x_k)||^2\right) =$$
$$ -\frac{1}{b}||\nabla f(x_k)||^2 + b||f(x_k)||^2 <0 \implies -\frac{1}{b}+b<0\implies -1+b^2<0\implies b^2<1$$

So by picking $b = \frac{1}{2}$ (which has square less than $1$) we have this value for $a$:

$$a = -\frac{1}{||d^{k-1}||^2}\frac{1}{2}d^{k-1}\nabla f(x_k)$$

our direction $d^k$ is

$$d^k = \left(\left(-\frac{1}{||d^{k-1}||^2}\frac{1}{2}d^{k-1}\nabla f(x_k)\right)d^{k-1} + \frac{1}{2}\nabla f(x_k)\right) = -\frac{1}{2}\nabla f(x_k) +\frac{1}{2}\nabla f(x_k) = 0$$

the problem is that I'm getting the vector $0$ as a direction

Best Answer

Note: I am answering assuming the dot between two vectors to be inner product which i denote for ease as $\langle.,.\rangle$.

So, we know that according to the above calculation $$a = -\frac{1}{\|d^{k-1}\|^2}b\langle\nabla f(x_k), d^{k-1}\rangle$$

Now we show $\langle d^k,\nabla f(x_k)\rangle <0$. So first by substituting the obtained $a$ in $d^k = a d^{k-1} + b\nabla f(x_k)$ we have

$$ d^k = -b\langle\nabla f(x_k), d^{k-1}\rangle\frac{d^{k-1}}{\|d^{k-1}\|^2} + b\nabla f(x_k)$$

now, compute

$$\langle d^k,\nabla f(x_k)\rangle = -b \frac{\langle\nabla f(x_k), d^{k-1}\rangle^2}{\|d^{k-1}\|^2} + b \|\nabla f(x_k)\|^2$$

So, now choose b accordingly. We need to do case wise distinction here for the coefficient of b.

1) If $- \frac{\langle\nabla f(x_k), d^{k-1}\rangle^2}{\|d^{k-1}\|^2} + \|\nabla f(x_k)\|^2 < 0$, then choose any $b>0$ for $\langle d^k,\nabla f(x_k)\rangle <0$.

2) If $- \frac{\langle\nabla f(x_k), d^{k-1}\rangle^2}{\|d^{k-1}\|^2} + \|\nabla f(x_k)\|^2 > 0$, then choose any $b<0$ for $\langle d^k,\nabla f(x_k)\rangle <0$.

3) If $- \frac{\langle\nabla f(x_k), d^{k-1}\rangle^2}{\|d^{k-1}\|^2} + \|\nabla f(x_k)\|^2 = 0$, then there doesnt exist a solution, because this implies that the gradient $\nabla f(x)^k$ and $d^{k-1}$ are parallel, so there is no component of them in the orthogonal direction which can contribute to $d^k$.

Now assuming 1 or 2 is satisfied and choose $b$ accordingly, then the direction should be

$$d^k = b \left(-\langle\nabla f(x_k), d^{k-1}\rangle\frac{d^{k-1}}{\|d^{k-1}\|^2} + \nabla f(x_k) \right)$$

Related Question