What are the subgradients of $f(x,y)=|x|+|y|$ and $f(x,y)=\sqrt{x^2+y^2}$

convex-analysissubgradient

I am trying to understand how can one compute subgradients for functions $f: \mathbb{R}^2 \supset M \to \mathbb{R}$.

I know that $\xi \in \mathbb{R}^2$ is a subgradient for function $f$ in $x_0$ if the following statement is true
$$f(x) \ge f(x_0) + \xi^T(x – x_0), \quad x \in M.$$

However when it comes to practice I am in trouble.

How can I compute subgradient for the following functions

  1. $f(x, y) = |x| + |y|$,
  2. $f(x, y) = \sqrt{x^2 + y^2}$?

I would appreciate a step by step solution.

Best Answer

Note that both of your functions are convex, which is a requirement for the subgradient to be defined. If $f$ is differentiable at a point $(x, y)$, then $\partial f = \{ \nabla f \}$, leaving only the computation of the subgradient at points where $f$ is not differentiable.

For these computations, it helps to note that the defining inequality of the subgradient $f(x) \ge f(x_0) + \xi^T (x - x_0)$ looks a lot like the inequality $f(x) \ge f(x_0) + (\nabla f)^T (x - x_0)$, which says that the function is above the tangent plane, which is true for convex functions. We can visualize the problem by trying to find "gradients" $\xi$ where the function is not differentiable that also have this property. Think about how for the function $f(x) = |x|$ you could find many "tangent lines" at $x = 0$ that lie below the graph by taking any line with slope $t \in [-1, 1]$ going through the origin. This is what we want to do in the two variable case.

For (1.), you can take the gradient at points where $f(x, y) = |x| + |y|$ is differentiable to compute the subgradient. So $$ \nabla f (x, y) = (\text{sgn}(x), \text{sgn}(y)), \text{ if } x \ne 0 \text{ and } y \ne 0. $$

For points where $f$ is not differentiable, i.e. where $x = 0$ or where $y = 0$, we need to do some more work. Suppose that $y = 0$ and $x > 0$. Then certainly $f$ is not differentiable, but $\frac{\partial f}{\partial x} = 1$, and "from above" we have $\frac{\partial f}{\partial y} = 1$ and "from below" we have $\frac{\partial f}{\partial y} = -1$, meaning the limits for the partial derivative are taken with $y > 0$ and $y < 0$ respectively. We should now be able to guess that for $x_0 > 0$, $$\partial f (x_0, 0) = \{ (1, t), \text{ where } t \in [-1, 1] \},$$ which we can check against the definition $$ |x| + |y| = x + |y| \ge x_0 + |0| + (1, t)^T (x - x_0, y - 0) = x + ty \text{ , where } t \in [-1, 1] $$ and certainly this is true, because $|y| \ge ty$ where $t \in [-1, 1]$. We can also prove that for any vector not in this set, there exists a point $(x, y)$ where the inequality fails. For example, if $t > 1$, picking $(x, y) = (x, y_0 + \epsilon)$ for $\epsilon > 0$ will violate this inequality.

The other cases are similar, except for the point $(0, 0)$. At this point, we can probably guess based on the other four cases that $\partial f (0, 0) = \{ (t, s) \text{ where } t \in [-1, 1] \text{ and } s \in [-1, 1]\}$, which we can also check in the definition directly: $$ |x| + |y| \ge (t, s)^T (x, y) = tx + sy \text{ , where } t \in [-1, 1] \text{ and } s \in [-1, 1]. $$ and once again you can check that any vector not in this set will have a point $(x, y)$ where this fails. We conclude that $$ \partial f (x, y) = \begin{cases} \{ (\text{sgn}(x), \text{sgn}(y)) \} ,& \text{ if } x \ne 0, y \ne 0 \\ \{ (\text{sgn}(x), t) \text{ where } t \in [-1, 1] \},& \text{ if } x \ne 0, y = 0 \\ \{ (t, \text{sgn}(y)) \text{ where } t \in [-1, 1] \},& \text{ if } x = 0, y \ne 0 \\ \{ (t, s) \text{ where } t \in [-1, 1] \text{ and } s \in [-1, 1]\},& \text{ if } x = 0, y = 0 \end{cases} $$

For (2.), note that it is very similar to the last case for (1.), as the only point where the function is not differentiable is $(0, 0)$.

Related Question