Coercive/(weakly) semicontinuous function: extreme values

Tags: coercive, global-optimization, maxima-minima, semicontinuous-functions

Consider functionals of the form
$$\phi : X \rightarrow \mathbb{R} \cup\{+\infty\},$$
where $X$ is an arbitrary normed vector space. In particular, $X$ may be infinite-dimensional. I would be fine with restrictions such as Banach spaces or separable/reflexive spaces, though, if necessary.

Now I have found several different statements about the existence of minimizers/maximizers of $\phi$ and am a little confused. So my first question is: which of the following statements are true/false? (For the sake of simplicity I consider only minimum values.) Let $U \subset X$ be compact.

  1. If $\phi$ is lower semicontinuous, it attains its minimum on $U$.
  2. If $\phi$ is sequentially weakly lower semicontinuous, it attains its minimum on $U$.
  3. If $\phi$ is lower semicontinuous, it does not necessarily attain its minimum on $X$. Does it at least possess an infimum?
  4. Same as (3.) with sequentially weakly lower semicontinuous $\phi$.
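(To make statement 3 concrete, here is a small numeric sanity check, not a proof; the two functions are my own toy choices, not from the question:)

```python
import math

# phi1(x) = exp(x) is lsc (even continuous) on X = R and bounded below,
# with inf phi1 = 0, but the infimum is not attained by any x.
# phi2(x) = x is also lsc on R, yet inf phi2 = -infinity, so a mere
# lsc function need not possess a *finite* infimum on all of X.

def phi1(x):
    return math.exp(x)

def phi2(x):
    return x

# Walk out along the sequence x_n = -n.
values1 = [phi1(-n) for n in range(1, 30)]
values2 = [phi2(-n) for n in range(1, 30)]

print(min(values1))  # decreases toward 0, never reaches it
print(min(values2))  # decreases without bound
```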

Now let $\phi$ additionally be coercive and please consider statements (1)-(4) again.

My second question would be if anyone could recommend literature on this topic.

Best Answer

Coercivity is a sufficient condition that lets you avoid assuming the constraint set is bounded. $f$ is called coercive if $f(x_n) \rightarrow \infty$ for every sequence $x_n$ with $\|x_n\|\rightarrow \infty$. The idea is that if this condition holds, you can restrict $f$ to bounded sets of increasing radius, and eventually a local minimizer becomes a global minimizer, since the function grows without bound in every feasible direction. See Kinderlehrer and Stampacchia.
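A quick numeric sketch of the growing-balls idea (my own toy example, not from any of the references):

```python
import numpy as np

# f(x) = x^2 - 3x is coercive on R: |x| -> infinity forces f(x) -> infinity.
# Minimize f over balls [-R, R] of increasing radius R; once R exceeds
# the true minimizer x* = 1.5, the constrained minimizer stops moving
# and is in fact the global minimizer.

def f(x):
    return x**2 - 3*x

mins = {}
for R in [1, 2, 3, 10]:
    grid = np.linspace(-R, R, 200001)     # brute-force grid search on [-R, R]
    mins[R] = grid[np.argmin(f(grid))]
    print(R, round(mins[R], 3))           # minimizer stabilizes at 1.5
```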

The definition of lower semicontinuity is: for every sequence $x_n \rightarrow x$, $$\liminf_{n \rightarrow \infty} \phi(x_n) \ge \phi(x).$$ So the idea is: let $\underline{\phi} = \inf_{x \in U} \phi(x)$ and take any minimizing sequence $x_n$ satisfying $$ \underline{\phi} + \dfrac{1}{n} \ge \phi(x_n) \ge \underline{\phi}. $$ In the classic Weierstrass version (as in $\mathbb{R}^N$), compactness of $U$ implies every sequence $x_n$ has a convergent subsequence $x_{n_k} \rightarrow x^* \in U$. Then $$ \underline{\phi} = \lim_{k \rightarrow \infty} \left( \underline{\phi} + \dfrac{1}{n_k} \right) \ge \liminf_{k \rightarrow \infty} \phi(x_{n_k}) \ge \underline{\phi}, $$ and lower semicontinuity gives $$ \underline{\phi} \ge \liminf_{k \rightarrow \infty} \phi(x_{n_k}) \ge \phi\Big(\lim_{k \rightarrow \infty} x_{n_k}\Big) = \phi(x^*) \ge \underline{\phi}, $$ so $\underline{\phi} = \phi(x^*)$ and $x^*\in U$ is a minimizer.
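To see why lower semicontinuity is needed in this argument, compare two functions on the compact set $U = [0,1]$ that differ only at the limit point (my own toy example):

```python
# phi_bad(x) = x for x > 0, phi_bad(0) = 1  -> NOT lsc at 0,
#     inf over [0,1] is 0 but it is not attained.
# phi_good(x) = x for x > 0, phi_good(0) = 0 -> lsc,
#     inf = 0 is attained at x* = 0.

def phi_bad(x):
    return x if x > 0 else 1.0

def phi_good(x):
    return x if x > 0 else 0.0

# A minimizing sequence x_n = 1/n for phi_bad: the values tend to the
# infimum 0, but at the limit point x* = 0 we get phi_bad(0) = 1 > 0,
# so the Weierstrass argument breaks exactly at the lsc step.
seq = [1.0 / n for n in range(1, 1000)]
print(min(phi_bad(x) for x in seq))   # close to 0, yet phi_bad(0) = 1
print(phi_good(0.0))                  # the lsc version attains the infimum
```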

To generalize the result to more abstract spaces, the problem is the existence of a convergent subsequence $x_{n_k} \rightarrow x^*$. That is all that is going on with the word salad you're trying to sort out.

In a metric space, this all still works, because compactness is equivalent to sequential compactness, so any sequence in a compact set has a convergent subsequence.

The problems begin when $U$ is a subset of an infinite-dimensional vector space and you can no longer take compactness for granted. The closed unit ball of an infinite-dimensional normed space is not compact in the norm topology, despite being closed and bounded. So the Heine-Borel theorem no longer applies, and if you want to prove $U$ is compact, you must show it is complete and totally bounded (in a metric space), or that every open cover has a finite subcover (in a topological space). This leads to other characterizations of compactness, like the Arzelà-Ascoli theorem. But you can't always just assume $U$ is compact, because sometimes it isn't.
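The standard counterexample behind this claim can be checked numerically in a finite truncation of $\ell^2$ (the truncation to $\mathbb{R}^N$ is my own device for the demonstration):

```python
import numpy as np

# The standard basis sequence e_1, e_2, ... lies in the closed unit
# ball of l^2, but any two distinct members are at distance sqrt(2),
# so no subsequence is Cauchy: the ball is not sequentially compact.

N = 50                       # truncate l^2 to R^N
E = np.eye(N)                # rows are e_1, ..., e_N
dists = [np.linalg.norm(E[i] - E[j])
         for i in range(N) for j in range(i + 1, N)]
print(min(dists), max(dists))  # all pairwise distances equal sqrt(2)
```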

Where does the weak stuff come from? Even though the closed unit ball is not norm-compact in an infinite-dimensional vector space, the closed unit ball of the dual space is weak* compact (Alaoglu's theorem). Because of this, you can extract a weak*-convergent subsequence in the dual space (for the sequential argument one also wants the predual separable, so the ball is weak* metrizable), so that a weak* lower semicontinuous function on a weak* compact set achieves a minimum. (Luenberger, "Optimization by Vector Space Methods", ch. 5)

The other kind of weak result goes like this. Kakutani's theorem states that a Banach space is reflexive iff its closed unit ball is compact in the weak topology. Hence if $E$ is a reflexive Banach space and $K \subset E$ is nonempty, closed, convex and bounded, then $K$ is compact in the weak topology. Then a coercive, convex, lsc function $f$ that is proper (i.e. $f \not\equiv +\infty$) achieves a minimum on any nonempty, closed, convex subset of $E$. The strategy is to exploit coercivity to get existence on any bounded subset, and then allow the bounded subsets to get large, which yields a global minimum (Brezis, "Functional Analysis, Sobolev Spaces and Partial Differential Equations", ch. 3).

There is yet another approach. If you are trying to minimize a distance in a Hilbert space to a closed convex set, you can exploit the parallelogram equality to show that any minimizing sequence is Cauchy, hence convergent. This works especially well when you have a point $x$ in a high-dimensional vector space and have to project it onto a lower-dimensional subspace, which is essentially all of statistics. (Kreyszig, Ch 4, or Luenberger, ch...3? I forget)
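The statistics connection is just ordinary least squares; here is a minimal sketch in the concrete Hilbert space $\mathbb{R}^5$ (the matrices and seed are my own choices):

```python
import numpy as np

# Project y onto the column span of A by least squares. The minimizer
# of ||y - A b|| exists, and it is characterized by the residual being
# orthogonal to the subspace -- the normal equations of OLS.

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))     # columns span a 2-dim subspace of R^5
y = rng.standard_normal(5)

b, *_ = np.linalg.lstsq(A, y, rcond=None)
residual = y - A @ b
print(np.abs(A.T @ residual).max())  # ~0: residual orthogonal to span(A)
```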

So the answer is that compactness + lsc in a metric space gets you existence every time, but all the other concepts come into play once you have an infinite-dimensional space and a subset that isn't assumed to be compact. Then you have to start considering other topologies and strategies to find your convergent $x_{n_k}$.
