[Math] Projected gradient descent

gradient descentoptimization

$\newcommand{\R}{\mathbb{R}}$There's the following problem that I found in a book: Let function $f:\R^2\to\R$ be defined as

$$
f(x) = x_1 + x_1x_2 + (1 + x_2)^2
$$

Considering the feasible set

$$
X = \{x \in \R^2 {}\mid{} 0 \leq x_1 \leq 1, 1 \leq x_2 \leq 2\}
$$

apply the first step of Projected Gradient Descent on

min f(x) 
x∈X
  s.l. x ∈ X

starting from z0 = [1 0]^T and step a=0.1


This problem is solved in the book in the following manner, and I quote:

For the iteration of projected gradient there are two things to be done:

i) calculating the gradient step:
$$
y = z_0 – \alpha \nabla f(z_0)
= \begin{bmatrix}1\\0\end{bmatrix}
– 0.1 \begin{bmatrix}4\\3\end{bmatrix}
= \begin{bmatrix}0.6\\-0.3\end{bmatrix}
$$

ii) calculating the projection sums up to solving this problem:

enter image description here

You can notice that the problem is separable in 2 independent scalar problems:

enter image description here

and

enter image description here

The solutions to the scalar problems are given by the projection of y1, y2 on the intervals [0,1], [1,2], which leads to z1* = 0.6 and z2* = 1. We conclude:

enter image description here


End quote.

This is the solution that can be found in the book, for this problem.

I am a bit confused and I feel as there are a few steps missing. I was able to follow until it got to the "ii)", where did that relation come from?

And three other questions that I have:

1) who are z1,z2 and y1,y2

2) how were the projections of y1, y2 on the intervals [0,1], [1,2] calculated?

3) who are z1*,z2*?

If somebody could maybe add the missing steps of the solution, that would be great.

Best Answer

Let us begin at step (i). That is the step in which they have computed the next point of iteration. Now, in projected gradient descent, we need to ensure that the point arising from the gradient should lie in the feasible set which in our case is $X = \{ (x_1, x_2) \in \mathbb{R}^2 | 0 \leq x_1 \leq 1, \ 1 \leq x_2 \leq 2 \}$. As you can note that $y \notin X$ and hence we would need to find the projection of $y$ on to the set $X$ to ensure that we are still in the feasible set.

The projection of a point on to a set is the point in the set that is closest to it. Mathematically speaking, $z^* = (z_1^*, z_2^*)$ is the projection of $y$ on the set $X$ if $z^* = \text{arg}\min_{z \in X} \| z - y \|^2$. Here $z = (z_1, z_2)$ is the variable of optimisation and $y = (y_1, y_2)$ is the value obtained in the first step.

Now consider the term $\| z - y \|^2 = (z_1 - y_1)^2 + (z_2 - y_2)^2$ where $z_1, z_2$ are variables and $y_1, y_2$ are constants for the optimisation. Notice that the first term on RHS only depends on $z_1$ and the second term only depends on $z_2$. This implies that we can minimise the first term only by changing $z_1$ and the second term only by changing $z_2$ and thus minimise the entire expression by individually minimising the two expressions. Hence, we have now got two independent scalar optimisation problems.

Let us consider solving them. So we have, $\min_{0 \leq z_1 \leq 1} (z_1 - y_1)^2$. Now the quadratic expression is minimised if $z_1 = y_1$. However, we also need to check if it lies in the feasible set. For this case, it does and hence we are good and we obtain $z_1^* = y_1 = 0.6$. Similarly, we consider $\min_{1 \leq z_2 \leq 2} (z_2 - y_2)^2$. Here the minimiser $y_2$ does not lie in the feasible set. So to find the point, note that $(z_2 + 0.3)^2$ is an increasing function of $z_2$ for $z_2 > 0$. Thus, to minimise the function, we will have to choose the smallest feasible value of the $z_2$ and that is $1$. This gives us $z_2^* = 1$ as required. I would suggest you should try plotting a graph and see how this is working geometrically.

As a comment, I would like to say that I have used a rather elaborate method to ensure that everything is easy to follow. However, I would recommend to try some similar problems of projection and optimisation as that will give you a better idea of what exactly is involved in such problems.

Hope this helps!