Gradient of a function involving the Euclidean projection

convex-analysis, jacobian, projection, vector-analysis

Given a non-empty, closed and convex set $C \subset \mathbb{R}^n$, the Euclidean projection of $y \in \mathbb{R}^n$ onto $C$ is given by

$$ \pi_C (y) = \arg \min_{x \in C} \frac12 \| y-x\|_2^2. $$
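For concreteness, a standard example (not part of the original question): if $C$ is the closed unit Euclidean ball, the projection has the closed form

$$ \pi_C(y) = \begin{cases} y, & \|y\|_2 \le 1, \\ y/\|y\|_2, & \|y\|_2 > 1, \end{cases} $$

so points inside $C$ are left fixed and points outside are rescaled onto the boundary.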

I am interested in calculating the gradient of the following function

$$f(y) = \frac12 \| y - \pi_C(y) \|_2^2. $$

Applying the chain rule gives

$$ \nabla f(y) = (I - J \pi_C(y))^T (y - \pi_C(y)), $$

where $J\pi_C$ is the Jacobian matrix of $\pi_C$. A paper I'm reading gives $\nabla f(y)=y-\pi_C(y)$, without further explanation. My questions are:

  1. Is the paper correct?

  2. If 1. is affirmative, does that mean the Jacobian of the projection is equal to the identity? Why, or why not?

Besides the chain rule, another clue I'm following is that $f$ can be rewritten as

$$ f(y) = \min_{x \in C} \frac12 \|y - x\|_2^2.$$

Does this lead anywhere?

Best Answer

The paper is correct. My favourite way of arriving at this result is to use the Asplund function:

$$\varphi_C(x) = \frac{1}{2}\|x\|^2 - \frac{1}{2}d_C^2(x),$$

where $d_C(x) = \inf_{c \in C} \|x - c\|$. As it turns out, $\varphi_C$ is convex, and $\pi_C(y) \in \partial \varphi_C(y)$, where $\partial \varphi_C(y)$ is the subdifferential of $\varphi_C$ at $y$. If we're happy to assume $\varphi_C$ is differentiable, then this implies $\pi_C(y)$ is the gradient of $\varphi_C$ at $y$. Using the fact that the gradient of $\frac{1}{2}\|x\|^2$ is $x$, this tells us that the gradient of $\frac{1}{2}d_C^2(x)$ (the function in which you are interested) is $x - \pi_C(x)$, as the paper claims.
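As a quick numerical sanity check of that conclusion, here is a small sketch (not part of the original answer) that takes $C$ to be the closed unit ball, for which the projection has a simple closed form, and compares a finite-difference gradient of $f$ against $y - \pi_C(y)$:

```python
import numpy as np

def proj_ball(y):
    """Euclidean projection onto the closed unit ball."""
    n = np.linalg.norm(y)
    return y if n <= 1.0 else y / n

def f(y):
    """f(y) = 1/2 * ||y - pi_C(y)||^2 = 1/2 * d_C(y)^2."""
    return 0.5 * np.linalg.norm(y - proj_ball(y)) ** 2

def num_grad(g, y, h=1e-6):
    """Central finite-difference approximation of the gradient of g at y."""
    e = np.eye(len(y))
    return np.array([(g(y + h * e[i]) - g(y - h * e[i])) / (2 * h)
                     for i in range(len(y))])

y = np.array([2.0, -1.0, 0.5])   # a point outside the ball
print(num_grad(f, y))            # finite-difference gradient of f
print(y - proj_ball(y))          # the paper's closed form
```

The two printed vectors should agree up to the finite-difference accuracy.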

First, let's show that the Asplund function is convex. We can write:

\begin{align*} \varphi_C(x) &= \frac{1}{2}\|x\|^2 - \frac{1}{2}\inf_{c \in C}\|x - c\|^2 \\ &= \frac{1}{2}\sup_{c \in C}\left(\|x\|^2 - \|x - c\|^2\right) \\ &= \frac{1}{2}\sup_{c \in C}\left(\|x\|^2 - \|x\|^2 + 2(x \cdot c) - \|c\|^2 \right) \\ &= \sup_{c \in C}\left(x \cdot c - \frac{1}{2}\|c\|^2 \right), \end{align*} which is a supremum of affine functions, making it a convex function.

Note that the supremum is achieved precisely when the original infimum is achieved, which is at $\pi_C(x)$, so $$\varphi_C(x) = x \cdot \pi_C(x) - \frac{1}{2}\|\pi_C(x)\|^2.$$
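As a concrete check (not in the original answer), take $C$ to be the closed unit ball again. Maximising $x \cdot c - \frac{1}{2}\|c\|^2$ over $\|c\| \le 1$ gives $c = x$ when $\|x\| \le 1$ and $c = x/\|x\|$ otherwise (exactly $\pi_C(x)$, as just noted), so

$$ \varphi_C(x) = \begin{cases} \frac{1}{2}\|x\|^2, & \|x\| \le 1, \\ \|x\| - \frac{1}{2}, & \|x\| > 1, \end{cases} $$

which agrees with $\frac{1}{2}\|x\|^2 - \frac{1}{2}d_C^2(x)$, since here $d_C(x) = \max(\|x\| - 1, 0)$.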

To show $\pi_C(y) \in \partial \varphi_C(y)$, we must show $$\pi_C(y) \cdot (x - y) \le \varphi_C(x) - \varphi_C(y)$$ for all $x$. We have \begin{align*} &\varphi_C(x) - \varphi_C(y) - \pi_C(y) \cdot (x - y) \\ = \, &\frac{1}{2}\|x\|^2 - \frac{1}{2}d_C^2(x) - y \cdot \pi_C(y) + \frac{1}{2}\|\pi_C(y)\|^2 - \pi_C(y) \cdot x + \pi_C(y) \cdot y \\ = \, &\frac{1}{2}\|x - \pi_C(y)\|^2 - \frac{1}{2} d_C^2(x) \ge 0, \end{align*} as $\pi_C(y) \in C$, using the definition of $d_C(x)$. This puts $\pi_C(y) \in \partial \varphi_C(y)$, so, assuming $\varphi_C$ is differentiable (which it is, since $C$ is closed and convex), its gradient is $\pi_C(y)$, and the result follows.
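The subgradient inequality above is easy to spot-check numerically; the sketch below (again assuming $C$ is the closed unit ball, and not part of the original answer) samples random pairs of points and verifies $\varphi_C(x) - \varphi_C(y) \ge \pi_C(y) \cdot (x - y)$:

```python
import numpy as np

rng = np.random.default_rng(0)

def proj_ball(z):
    """Euclidean projection onto the closed unit ball."""
    n = np.linalg.norm(z)
    return z if n <= 1.0 else z / n

def phi(z):
    """Asplund function: 1/2 ||z||^2 - 1/2 d_C(z)^2."""
    return 0.5 * np.linalg.norm(z) ** 2 - 0.5 * np.linalg.norm(z - proj_ball(z)) ** 2

for _ in range(10_000):
    x = 3.0 * rng.standard_normal(3)
    y = 3.0 * rng.standard_normal(3)
    # Subgradient inequality: phi(x) - phi(y) >= <pi_C(y), x - y>
    gap = phi(x) - phi(y) - proj_ball(y) @ (x - y)
    assert gap >= -1e-10, gap
print("subgradient inequality held on all samples")
```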

To answer your second question: no, this doesn't make the Jacobian the identity. A single matrix-vector identity tells you very little about the matrix. Here, comparing the chain-rule expression with $\nabla f(y) = y - \pi_C(y)$ gives $(J \pi_C(y))^T (y - \pi_C(y)) = 0$ wherever $\pi_C$ is differentiable, i.e. $y - \pi_C(y)$ lies in the null space of $(J \pi_C(y))^T$. Nothing forces $J \pi_C(y)$ to be the identity; indeed, if it were, the chain rule would give $\nabla f(y) = 0$.
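To make this concrete (an illustration, not from the original answer): with $C$ the closed unit ball and $\|y\| > 1$, we have $\pi_C(y) = y/\|y\|$, so

$$ J \pi_C(y) = \frac{1}{\|y\|}\left( I - \frac{y y^T}{\|y\|^2} \right), $$

which is not the identity; it maps $y - \pi_C(y)$ (a multiple of $y$) to $0$, as expected.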
