Understanding derivation of dual for Infinite Linear Program

convex optimizationconvex-analysisduality-theoremsfunctional-analysisoptimization

I'm reading the section on Linear Programming in Barbu and Precupanu's Convexity and Optimization in Banach Spaces (p. 206 in the 4th edition), and had a couple of questions concerning their derivation of the dual problem for the infinite dimensional linear program:
$$ (\mathscr{P}) \quad \min\{(x_0^*,x): x\in P, \hspace{.1cm} y_0 – Ax\in Q\}$$
where $X$ and $Y$ are two Banach spaces, $P \subset X$, $Q\subset Y$ are closed convex cones, $A: X\to Y$ is a linear continuous operator, and $x_0^*\in X^*$ and $y_0\in Y$. They note that $\mathscr{P}$ can be obtained from
$$ (\mathscr{P}_1) \quad \min\{f(x)-g(Ax) \}$$
in which the perturbations are generated by translations, where $X$, and $Y$ are, again, real Banach spaces, $f: X\to ]-\infty, +\infty]$ is a proper, convex, and lower-semicontinuous function, $g: Y\to [-\infty, +\infty[$ is a proper, concave, and upper-semicontinuous function, and $A:X\to Y$ is a linear continuous operator. They take $f=x_0^*+I_P$ and $g = -I_{y_0-Q}$, and proceed to calculate conjugate functionals as follows:
$$\begin{aligned}
f^*(x^*) &= \sup\{(x^*-x_0^*,x): x\in P \} = I_{P^0}(x^*-x_0^*), \\
g^*(y^*) &= \{(y^*,y): y\in y_0-Q\} = (y^*,y_0)-\sup\{(y^*,y); y\in Q\}\\
&= (y^*,y_0) – I_{Q^0}(y^*)
\end{aligned}$$

Arriving at a dual problem associated with $\mathscr{P}_2$:
$$ (\mathscr{P}^*) \quad \max\{(y^*,y_0): y^*\in Q^0, A^*y^*-x_0^*\in P^0 \}$$
My questions about the derivation are as follows:

  • Why did they choose $f$ and $g$ as they did for this derivation? I'm having trouble relating it to the choices of $f$ and $g$ for the derivation of the dual problem for an LP in $\mathbb{R}^n$. That is to say for the canonical form of the finite dimensional problem, $$(\mathscr{P}_c)\quad \min\{\langle x,b\rangle _n:x\in\mathbb{R}^n,x\geq 0,Ax\geq c\}$$ they take $f(x) = \langle x,b\rangle _n $ for $x\geq 0$, and $+\infty$ otherwise, and $g(y)=0$ if $y\geq c$, and $-\infty $ otherwise.

  • Why is $\sup\{(x^*-x_0^*,x): x\in P \} = I_{P^0}(x^*-x_0^*)$? I believe $I_{P^0}(x^*-x_0^*)$ is the indicator function on the polar of the set $P$, but it seems as if this should be the support functional of $P^0$, since by definition for a subset $A$ of $X$: $I^*_A(x^*)=\sup\{(x,x^*); x\in A\}$ represents the equation of a supporting hyperplane of $A$, which in this case I believe is $P^0$. This would make sense to me since the basic idea behind duality and constructing the dual problem is recovering a closed convex set from its supporting hyperplanes.

  • In calculating the convex conjugate of $g$, why is $g^*(y^*) = \{(y^*,y): y\in y_0-Q\}$, and not $g^*(y^*) = \sup\{(y^*,y)-g(y): y\in y_0-Q\}$? I thought that for any function $f:X\to\overline{\mathbb{R}}$, we had that its convex conjugate was defined as $f^*(x^*) = \sup\{(x,x^*)-f(x); \quad x\in X\}, x^*\in X^*$. Again, my understanding of this geometrically, in terms of why we care about convex conjugate functionals, and relating to the construction of the dual was that we are in some sense picking the "smallest" slope for a hyperplane that is tangent to the convex function. Is there some simplification or assumption that I'm missing?

  • Lastly, why is $\{(y^*,y): y\in y_0-Q\} = (y^*,y_0)-\sup\{(y^*,y); y\in Q\}$? Without the introduction of $\sup$ as I imagine there should be, I don't see how they made that jump.

    I suspect that I'm missing something painfully obvious. Any help addressing my above questions would really help my understanding and be greatly appreciated 🙂

Best Answer

  1. 'Because it works'. If you pick a different $f$ and $g$ I am sure you can also make it work, but the structure of $\mathscr{P}_1$ makes it easy to derive the dual.

  2. The indicator function $I_{P^0}(x^*-x_0^*)$ takes values $0$ and $\infty$: it takes the value $0$ if $(x^*-x_0^*, x) \leq 1$ for all $x \in P$, which is the same as requiring $(x^*-x_0^*, x) \leq 0$ for all $x \in P$ (because $P$ is a cone). The expression $\sup\{(x^*-x_0^*,x): x\in P \}$ also takes values $0$ and $\infty$ (since $P$ is a cone): it takes the value $0$ if $(x^*-x_0^*,x)\leq 0$ for all $x\in P$. Therefore, the functions are identical.

  3. You have $g(y) = -I_Q(y_0 - y)$ , so $g^*(y^*) = \inf_y \{ (y^*,y) + I_Q(y_0 - y \}$. Substituting $z = y_0 - y$ gives: $g^*(y^*) = (y^*,y_0) - \sup_z \{ (y^*,z) - I_Q(z) \}$. For the last term you can use the same reformulation as in 2 to relate the support function of $Q$ to the indicator of the polar of $Q$.

Related Question