Proof that local minimizer of a convex function is also a global minimizer

convex-analysis, convex-geometry, maxima-minima, optimization, proof-explanation

Convex functions in $\mathbb{R}$

Let $f:\mathbb{R} \rightarrow (-\infty,\infty]$ be a convex function. Then $f$ possesses side derivatives for all $x \in \text{dom}(f)$:
$$
f'_l(x):= \lim_{y\rightarrow x, y <x}\frac{f(y)-f(x)}{y-x}
\quad\text{and}\quad
f'_r(x):=
\lim_{y\rightarrow x, y >x}\frac{f(y)-f(x)}{y-x}
$$

with possible values in $[-\infty, \infty]$.
Furthermore, for arbitrary $x,y \in \text{dom}(f)$ with $x<y$:
$$
f'_l(x)\leq f'_r(x) \leq \frac{f(y) - f(x)}{y-x}\leq f'_l(y)\leq f'_r(y)
$$

In particular, $f'_l$ and $f'_r$ are monotonically increasing on $\text{dom}(f)$.
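Before moving on, here is a quick numeric sanity check of the slope chain (my own illustration, not part of the quoted text), with the example choice $f(x) = |x|$, whose side derivatives at $0$ are $f'_l(0) = -1$ and $f'_r(0) = 1$:

```python
# Numeric sanity check of the slope chain, using f(x) = |x| as an
# illustrative choice (convex, with genuinely different side derivatives
# at 0: f'_l(0) = -1 and f'_r(0) = 1).

def f(x):
    return abs(x)

def side_slopes(func, x, h=1e-8):
    """One-sided difference quotients approximating f'_l(x) and f'_r(x)."""
    left = (func(x) - func(x - h)) / h
    right = (func(x + h) - func(x)) / h
    return left, right

x, y = 0.0, 1.0
fl_x, fr_x = side_slopes(f, x)
fl_y, fr_y = side_slopes(f, y)
chord = (f(y) - f(x)) / (y - x)

# Expect f'_l(x) <= f'_r(x) <= chord <= f'_l(y) <= f'_r(y):
print(fl_x, fr_x, chord, fl_y, fr_y)   # ~ -1.0  1.0  1.0  1.0  1.0
```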

Convex functions in $\mathbb{R}^n$

Let $f:\textbf{V}\rightarrow (-\infty, \infty]$ be a convex function on a real vector space $\textbf{V}$ (e.g. $\textbf{V} = \mathbb{R}^n$).

For any point $\textbf{x} \in \text{dom}(f)$ and vector $\textbf{v}\in\textbf{V}$, both one-sided directional derivatives exist; e.g., the right one is
$$
f'_r(\textbf{x},\textbf{v}):= \lim_{t\rightarrow 0, t>0}\frac{f(\textbf{x} + t\textbf{v}) - f(\textbf{x})}{t}\in [-\infty, \infty]
$$

Proof: This simply follows from the existence of side derivatives of convex functions on $\mathbb{R}$, applied to the function $t \mapsto f(\textbf{x}+t\textbf{v})$ on $\mathbb{R}$.
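To make the restriction-to-a-line idea concrete, here is a small sketch (my own, with $f = \|\cdot\|_1$ chosen because it has a kink at the origin): the forward difference quotient of $g(t) = f(\mathbf{x} + t\mathbf{v})$ at $t = 0$ approximates $f'_r(\mathbf{x}, \mathbf{v})$, which for the 1-norm at $\mathbf{x} = \mathbf{0}$ equals $\|\mathbf{v}\|_1$:

```python
# Right directional derivative via the one-dimensional restriction
# g(t) = f(x + t v). Example f (my choice): the 1-norm, which is convex
# and non-differentiable at the origin.

def f(x):
    return sum(abs(xi) for xi in x)

def right_dir_deriv(func, x, v, t=1e-8):
    """Forward difference quotient of g(t) = func(x + t v) at t = 0."""
    g = lambda s: func([xi + s * vi for xi, vi in zip(x, v)])
    return (g(t) - g(0.0)) / t

x = [0.0, 0.0]
v = [1.0, -2.0]
print(right_dir_deriv(f, x, v))   # ~ 3.0 = ||v||_1, even though f is not
                                  # differentiable at x
```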

As a function of $\textbf{v}$, $f'_r(\textbf{x},\textbf{v})$ is sublinear, i.e.:
\begin{align*}
f'_r(\textbf{x},\lambda\textbf{v}) &= \lambda f'_r(\textbf{x},\textbf{v}) \quad\text{for } \textbf{v}\in \textbf{V},\ \lambda \geq 0,\\
f'_r(\textbf{x},\textbf{v}+\textbf{w}) &\leq f'_r(\textbf{x},\textbf{v}) + f'_r(\textbf{x},\textbf{w})\\
&\qquad\text{for } \textbf{v}, \textbf{w}\in \textbf{V} \text{ with } \{f'_r(\textbf{x},\textbf{v}), f'_r(\textbf{x},\textbf{w})\}\neq \{-\infty, \infty\}
\end{align*}
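A numeric check of both sublinearity properties (my own illustration, again with the 1-norm at the origin, where $f'_r(\mathbf{0}, \mathbf{v}) = \|\mathbf{v}\|_1$):

```python
# Numeric check of sublinearity of v -> f'_r(x, v), with f = the 1-norm at
# x = 0 (my example choice), where f'_r(0, v) = ||v||_1.

def f(x):
    return sum(abs(xi) for xi in x)

def right_dir_deriv(func, x, v, t=1e-8):
    g = lambda s: func([xi + s * vi for xi, vi in zip(x, v)])
    return (g(t) - g(0.0)) / t

x = [0.0, 0.0]
v, w, lam = [1.0, -2.0], [-3.0, 0.5], 2.5

hom_lhs = right_dir_deriv(f, x, [lam * vi for vi in v])
hom_rhs = lam * right_dir_deriv(f, x, v)
sub_lhs = right_dir_deriv(f, x, [vi + wi for vi, wi in zip(v, w)])
sub_rhs = right_dir_deriv(f, x, v) + right_dir_deriv(f, x, w)

print(hom_lhs, hom_rhs)   # ~ 7.5 and 7.5   (positive homogeneity)
print(sub_lhs, sub_rhs)   # ~ 3.5 <= 6.5    (subadditivity)
```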

In particular, $f'_r(\textbf{x}, \cdot)$ is itself a convex function, provided it does not take the value $-\infty$. The previous theorem on side derivatives also yields:
$$
f(\textbf{x})+ f'_r(\textbf{x}, \textbf{v}) \leq f(\textbf{x}+ \textbf{v})\text{ for all }\textbf{v}\in \textbf{V} \quad (1)
$$
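A quick numeric check of (1) (again my own illustration, with the 1-norm as the convex function): since the difference quotient at any $t \in (0, 1]$ already sits below $f(\mathbf{x} + \mathbf{v}) - f(\mathbf{x})$, the sampled inequality should hold up to rounding error.

```python
import random

# Numeric check of (1): f(x) + f'_r(x, v) <= f(x + v), with f = the 1-norm
# (an illustrative choice, not prescribed by the original text).

def f(x):
    return sum(abs(xi) for xi in x)

def right_dir_deriv(func, x, v, t=1e-8):
    g = lambda s: func([xi + s * vi for xi, vi in zip(x, v)])
    return (g(t) - g(0.0)) / t

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(3)]
    v = [random.uniform(-1, 1) for _ in range(3)]
    xv = [xi + vi for xi, vi in zip(x, v)]
    assert f(x) + right_dir_deriv(f, x, v) <= f(xv) + 1e-6
print("inequality (1) held on all samples")
```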

The previous assertion, together with the definition of directional derivatives, implies that a local minimizer of a convex function is also a global minimizer. (2)

My questions

  • How does the definition of the side derivatives help to prove (1)?
  • How can we conclude (2) from it?

Best Answer

Nope, I don't like this proof. The result doesn't need derivatives at all, and I insist on presenting an alternative proof first. I know you tagged this proof-explanation (and I will get to explaining it), but first I want to prove this properly.

Suppose $f$ is convex, and attains a local minimum at $x$. Specifically, there is some $\varepsilon > 0$ such that if $\|y - x\| < \varepsilon$, then $f(y) \ge f(x)$.

Now, consider some $z$ in $\Bbb{R}^n$ (if $z = x$ there is nothing to prove). The idea is to construct a convex combination $y$ of $z$ and $x$ so that $\|y - x\| < \varepsilon$, then use the local minimum property and the standard convexity inequality. Let us take $$y = \lambda z + (1 - \lambda)x$$ where $\lambda \in (0, 1]$ is small enough that $\|y - x\| < \varepsilon$. Solving, $$\varepsilon > \|y - x\| = \|\lambda z + (1 - \lambda)x - x\| = \lambda\|z - x\|,$$ so we will choose $0 < \lambda < \min\{1, \varepsilon / \|z - x\|\}$. The actual value of $\lambda$ doesn't matter, so long as it exists and satisfies the given requirements. Then, \begin{align*} f(x) \le f(y) \le \lambda f(z) + (1 - \lambda)f(x) &\implies \lambda f(x) \le \lambda f(z) \\ &\implies f(x) \le f(z), \end{align*} where the last step divides by $\lambda > 0$; i.e. $f$ attains a global minimum at $x$.
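To see the construction with concrete numbers, here is a small walk-through (the choices of $f$, $x$, $\varepsilon$, and $z$ below are mine, purely for illustration):

```python
# Walk-through of the convex-combination argument with concrete numbers
# (the choices of f, x, eps and z below are mine, purely for illustration).

def f(x):
    return (x - 1.0) ** 2        # convex, with its minimum at x = 1

x, eps, z = 1.0, 0.5, 10.0

# Pick lam in (0, 1) with lam * ||z - x|| < eps, exactly as in the proof.
lam = 0.5 * min(1.0, eps / abs(z - x))
y = lam * z + (1 - lam) * x

assert abs(y - x) < eps                       # y lies in the eps-ball around x
assert f(x) <= f(y)                           # local-minimum property
assert f(y) <= lam * f(z) + (1 - lam) * f(x)  # convexity inequality
assert f(x) <= f(z)                           # conclusion, after dividing by lam > 0
print(lam, y, f(x), f(y), f(z))
```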


As for your actual questions,

(1) First note that, for any $t \in (0, 1]$, \begin{align*} f(x + v) - \frac{f(x + tv) - f(x)}{t} - f(x) &= \frac{tf(x + v) - f(x + tv) + f(x) - tf(x)}{t} \\ &= \frac{tf(x + v) + (1 - t)f(x) - f(x + tv)}{t} \\ &\ge \frac{f(t(x + v) + (1 - t)x) - f(x + tv)}{t} \\ &= 0, \end{align*} where the inequality is the standard convexity inequality. The difference quotient $\frac{f(x + tv) - f(x)}{t}$ is nondecreasing in $t$ (by the one-dimensional theory above), so its limit as $t \to 0^+$ exists (possibly $-\infty$). Taking limits of both sides as $t \to 0^+$, $$0 \le f(x + v) - \lim_{t \to 0^+} \frac{f(x + tv) - f(x)}{t} - f(x) = f(x + v) - f'_r(x, v) - f(x),$$ which is equivalent to what you want proven.
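Equivalently, every difference quotient is bounded above by $f(x + v) - f(x)$, because $t \mapsto (f(x + tv) - f(x))/t$ is nondecreasing on $(0, 1]$. A numeric illustration of that monotonicity (my own, again with the 1-norm as an example choice):

```python
# The difference quotient t -> (f(x + t v) - f(x)) / t is nondecreasing
# in t for convex f, so its limit as t -> 0+ exists and sits below its
# value at t = 1, i.e. below f(x + v) - f(x). Example f (my choice): the 1-norm.

def f(x):
    return sum(abs(xi) for xi in x)

def quotient(t, x, v):
    xt = [xi + t * vi for xi, vi in zip(x, v)]
    return (f(xt) - f(x)) / t

x, v = [0.5, -0.25], [-1.0, 1.0]
qs = [quotient(t, x, v) for t in (0.01, 0.1, 0.25, 0.5, 1.0)]
print(qs)   # ~ [-2.0, -2.0, -2.0, -1.0, 0.5], a nondecreasing sequence
```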

(2) If $x$ is a local minimizer, then $f$ has a non-negative right directional derivative at $x$ in every direction: for all small enough $t > 0$ we have $f(x + tv) \ge f(x)$, so each difference quotient is non-negative, and hence so is the limit $f'_r(x, v)$. Thus, by (1), $$f(x + v) \ge f'_r(x, v) + f(x) \ge 0 + f(x).$$ This holds for all $v$ (not necessarily small), hence the minimum is global.
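Putting (1) and (2) together in code form, here is a sketch under my own example choice $f(\mathbf{x}) = \|\mathbf{x}\|_1 + \|\mathbf{x}\|^2$, which is convex with its minimum at the origin:

```python
import random

# Step (2) in miniature: at a minimizer x of a convex f, the right
# directional derivative is >= 0 in every direction, and (1) then forces
# f(x + v) >= f(x) for every v. Example f (my choice): ||x||_1 + ||x||^2.

def f(x):
    return sum(abs(xi) for xi in x) + sum(xi * xi for xi in x)

def right_dir_deriv(func, x, v, t=1e-8):
    g = lambda s: func([xi + s * vi for xi, vi in zip(x, v)])
    return (g(t) - g(0.0)) / t

x = [0.0, 0.0, 0.0]                      # the minimizer
random.seed(1)
for _ in range(1000):
    v = [random.uniform(-1, 1) for _ in range(3)]
    assert right_dir_deriv(f, x, v) >= 0.0                # f'_r(x, v) >= 0
    assert f([xi + vi for xi, vi in zip(x, v)]) >= f(x)   # no smaller value found
print("all sampled directions: f'_r(x, v) >= 0 and f(x + v) >= f(x)")
```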
