Separating Hyperplanes and Hahn-Banach

convex-analysis, functional-analysis

Let $f:X \rightarrow \mathbb{R}$ be convex and $x \in X$. Assume further that $f$ is finite and continuous at $x$. The claim is then that, by Hahn-Banach, there is $x^* \in X^*$ with

\begin{equation}
\langle y - x, x^* \rangle_{X \times X^*} + f(x) \leq f(y) \ \ \ \forall y \in X.
\end{equation}

I'm trying to understand why this is true. As far as I know, by Hahn-Banach we can separate the convex set given by the points above the graph of $f$ from the point $(x,f(x))$ by some hyperplane. More generally, for disjoint convex sets $A,B \subset X$ with $A$ open, there are $x^* \in X^*$ and $\gamma \in \mathbb{R}$ such that

\begin{equation}
\langle a,x^* \rangle _{X \times X^*} \leq \gamma \leq \langle b,x^* \rangle _{X \times X^*} \ \ \ \forall a\in A,\ b \in B. \tag{1}
\end{equation}

Intuitively, the above estimate makes sense. The desired hyperplane is an affine function that lies below the graph of $f$ and touches the graph at the point $(x,f(x))$. But I would like to understand how this follows from the formal statement $(1)$ in the Hahn-Banach Theorem. Thank you for any help.

Best Answer

Let me add another answer to address your modified question.

What we want to do is to use the Hahn-Banach separation theorem with the open set $(\text{epi} f)^\circ$ and the compact set $\{(x_0,f(x_0))\}$, where $x_0$ is the point called $x$ in the question. Continuity at $x_0$ guarantees that $(\text{epi} f)^\circ$ is not empty. Without continuity the interior can be empty, for example if $f$ is a discontinuous linear functional.

Recall that the epigraph $\text{epi} f:= \{(y,r)\in X\times\Bbb R: r\ge f(y)\}$ is convex, and that $(X\times \Bbb R)^*\simeq X^*\times \Bbb R$. Note also that for a convex set $C$ with nonempty interior, we have $\overline C = \overline{(C^\circ)}$.

Let $x_0$ be a point in $X$ and take a small $\varepsilon>0$. By continuity of $f$ at $x_0$ we can find an open neighborhood $V$ of $x_0$ such that $f(y)<f(x_0)+\varepsilon/2$ for all $y\in V$. This means that $$ V\times (f(x_0)+\varepsilon,+ \infty) \subset \text{epi} f, $$ and since the set on the left is open, $(\text{epi} f)^\circ\ne \emptyset$.

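Now, let's apply the Hahn-Banach separation theorem to the open convex set $(\text{epi} f)^\circ$ and the compact convex set $\{(x_0,f(x_0))\}$. These are disjoint: $(x_0,f(x_0)-\delta)\notin \text{epi} f$ for every $\delta>0$, so $(x_0,f(x_0))$ is not an interior point of the epigraph. We get $(x^*,\lambda)\in X^*\times \Bbb R$ such that $$ \langle x_0,x^* \rangle + \lambda f(x_0) > \langle y,x^* \rangle + \lambda r $$ for all $(y,r)\in (\text{epi} f)^\circ$.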

The above argument with small enough $\varepsilon$ shows that $(x_0,r)\in (\text{epi} f)^\circ$ for all $r>f(x_0)$. Substituting $(y,r)=(x_0,r)$ into the inequality and rearranging the terms gives $$ \lambda(f(x_0)-r) > 0, $$ and since $f(x_0)-r<0$ this implies that $\lambda<0$. We may thus rescale $(x^*,\lambda)$ so that $\lambda=-1$.
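
Written out, the rescaling step reads as follows. The symbol $\tilde x^*$ is introduced here only for illustration and does not appear in the original argument. Dividing the separation inequality by $-\lambda>0$ and setting $\tilde x^* := x^*/(-\lambda)$ gives

$$ \langle x_0,\tilde x^* \rangle - f(x_0) > \langle y,\tilde x^* \rangle - r \qquad \text{for all } (y,r)\in (\text{epi} f)^\circ, $$

which is the separation inequality for the pair $(\tilde x^*,-1)$. From here on $\tilde x^*$ is renamed back to $x^*$.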

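Since $\text{epi} f \subset \overline{\text{epi} f} =\overline{(\text{epi} f)^\circ}$ and $(y,f(y))\in \text{epi}f$, every point $(y,f(y))$ is a limit of points of $(\text{epi} f)^\circ$. Passing to the limit in the inequality (so the strict $>$ becomes $\ge$) gives $$ \langle x_0,x^* \rangle - f(x_0) \ge \langle y,x^* \rangle - f(y) $$ for all $y\in X$. Rearranged, this is $\langle y-x_0,x^* \rangle + f(x_0) \le f(y)$, which is exactly the inequality in the question. Hence the (rescaled) functional $x^*$ is a subgradient of $f$ at $x_0$, i.e. $x^*\in \partial f(x_0)$.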
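
As a sanity check (an illustration added here, not part of the argument above), take $X=\Bbb R$ and $f(y)=y^2$, identifying $X^*$ with $\Bbb R$. Assuming the candidate $x^*=2x_0$, i.e. the derivative of $f$ at $x_0$, the subgradient inequality reduces to a perfect square:

$$ f(y)-f(x_0)-\langle y-x_0,x^* \rangle = y^2-x_0^2-2x_0(y-x_0) = (y-x_0)^2 \ge 0 \qquad \text{for all } y\in\Bbb R. $$

In this case the separating hyperplane is the tangent line to the parabola at $(x_0,x_0^2)$: an affine function lying below the graph and touching it at that one point.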
