Characterize subdifferential of a convex function by directional derivative

convex-analysisderivativesfunctional-analysisnormed-spaces

This thread is meant to record a question that I feel interesting during my self-study. I'm very happy to receive your suggestion and comments. See: SE blog: Answer own Question and MSE meta: Answer own Question.


Let $A$ be a subset of a normed space $X$ and $f:A \to \mathbb R$.

  • Let $a \in \operatorname{int} A$. For $v \in X$, the right directional derivative $f_{+}^{\prime}(a)[v]$, the left directional derivative $f_{-}^{\prime}(a)[v]$, and the (bilateral) directional derivative $f^{\prime}(a)[v]$ are defined by:
    $$
    \begin{aligned}
    f_{+}^{\prime}(a)[v] &= \lim _{t \to 0^+} \frac{f(a+t v)-f(a)}{t} \\
    f_{-}^{\prime}(a)[v] &= \lim _{t \to 0^-} \frac{f(a+t v)-f(a)}{t} \\
    f^{\prime}(a)[v] &= \lim _{t \to 0} \frac{f(a+t v)-f(a)}{t}.
    \end{aligned}
    $$

    We say that $f$ is Gâteaux differentiable at $a$ if $f^{\prime}(a) \in X^{*}$.

  • The subdifferential of $f$ at $a \in A$ is the set
    $$
    \partial f(a)=\left\{x^* \in X^* \mid f(x) – f(a) \ge \langle x^*, x-a \rangle \text { for each } x \in A\right\}.
    $$

    The elements of $\partial f(a)$ are called subgradients of $f$ at $a$.

Theorem: Assume $A$ is open convex and $f$ convex. For $a\in A$ and $x^* \in X^*$, the following assertions are equivalent:

  • (i) $x^* \in \partial f(a)$;
  • (ii) $x^*(v) \leq f_{+}^{\prime}(a)[v]$ for each $v \in X$;
  • (iii) $f_-^{\prime}(a)[v] \leq x^*(v) \leq f_{+}^{\prime}(a)[v]$ for each $v \in X$.

As a corollary, we obtain that $\partial f(a)$ is fully determined by the values of $f$ in any neighborhood of $a$.

Best Answer

Below, we use $x^*(v)$ and $\langle x^*, v \rangle$ interchangeably. Notice that $f'_-(a)[v] = -f'_+(a)[-v]$, so (ii) is equivalent to (iii).

Let's prove (i) implies (ii). Let $x^* \in \partial f(a)$, i.e., $$ f(x)-f(a) \ge \langle x^*, x-a \rangle \quad \forall x\in A. $$ Then $f(a+tv)-f(a) \ge t\langle x^*, v \rangle$ and thus $$ \frac{f(a+t v)-f(a)}{t} \ge \langle x^*, v \rangle \quad \forall t>0. $$ The claim then follows by taking the limit $t \to 0^+$.

Let's prove (ii) implies (i). Notice that $f$ is convex, so the map $$ \varphi:(0, +\infty) \to \mathbb R, t \mapsto \frac{f(a+t v)-f(a)}{t} $$ is increasing. Assume $\langle x^*, v \rangle \le f_{+}^{\prime}(a)[v]$ for each $v \in X$. Then $\langle x^*, v \rangle \le \varphi (t)$ for all $t>0$. We pick $t=1$ and $v=x-a$. Then $$ \langle x^*, x-a \rangle \le \frac{f(a+1(x-a))-f(a)}{1}. $$ This completes the proof.