Why is this the optimal value of the dual problem

convex optimizationproof-explanation

Let the primal problem be to determine the feasibility of the system $f_0 =0, f_i(x) \le 0$ for $i=1, 2, \dots, m$ and
$h_i(x)=0$ for $i = 1, 2, \dots, m$.

The dual problem associated with this is to maximize $g(\lambda, v) =
\inf_{x \in D}(\lambda^Tf(x) + v^Th(x))$
.

The optimal value of the dual problem is given by

$d^* =
\begin{cases}
\infty, & \text{ if } \space\lambda \ge 0, g(\lambda, v)\gt 0 \text{ is feasible } \\[2ex]
0, & \text{ if } \space\lambda \ge 0, g(\lambda, v)\gt 0 \text{ is infeasible }
\end{cases}$

Why is the optimal value of the dual problem given as $d^*$? Why does $g(\lambda, v) \gt 0$ have to be feasible or infeasible? Why can't we have $g(\lambda, v) \le 0$ in any situation? If $g \lt 0$ then how can $d^*=0$?

Below is the full text from where this is taken. From the book Convex Optimization.


enter image description here

Best Answer

Let $f(x) = (f_1(x), ..., f_m(x))$ and $h(x)=(h_1(x), ..., h_p(x))$. Let $D$ be a set. The primal problem is to find $x \in D$ that satisfies $f(x)\leq 0, h(x)=0$. We say the primal problem is feasible if there is such an $x \in D$.

Define $$ M = \{(\lambda, v) : \lambda \in \mathbb{R}^m, v \in \mathbb{R}^p, \lambda \geq 0\}$$ The dual function $g(\lambda, v)$ should be maximized subject to the constraint $(\lambda, v) \in M$. Define $d^* = \sup_{(\lambda, v) \in M} g(\lambda, v)$. It is not difficult to show that $$d^* = \left\{ \begin{array}{ll} \infty &\mbox{ if the primal is not feasible } \\ 0 & \mbox{ if the primal is feasible} \end{array} \right.$$ So there are only two possibilities for $d^*$ (either $d^*=0$ or $d^*=\infty)$. It follows that if there is a $(\lambda, v) \in M$ such that $g(\lambda, v)>0$, then certainly the supremum of $g(\lambda, v)$ cannot be zero, so the supremum must be $\infty$. Conversely, if there is no $(\lambda, v) \in M$ such that $g(\lambda, v)>0$, then the supremum certainly cannot be $\infty$ and so the supremum must be $0$. Thus $$ d^* = \left\{ \begin{array}{ll} \infty &\mbox{ if there is a $(\lambda, v) \in M$ such that $g(\lambda,v)>0$. } \\ 0 & \mbox{ else} \end{array} \right.$$ This is likely what the equation in your question means.


I agree with Tony S.F. comments that the way of writing $d^*$ in your question (which I think is equivalent to my "second way" of writing $d^*$ above) is a bit unusual. However, in your cut-and-paste it seems the authors use the second way to prove the first way (not the other way round as I suggest above).

Actually it is not a bad way to prove it: The authors observe:

  • $(\lambda, v) \in M \implies (\alpha \lambda, \alpha v) \in M$ for any real number $\alpha \geq 0$.

  • $g(\alpha \lambda, \alpha v) = \alpha g(\lambda, v)$ for any real number $\alpha \geq 0$.

  • $(0,0) \in M$.

So if there is a $(\lambda, v) \in M$ such that $g(\lambda, v)>0$, we observe $\lim_{\alpha\rightarrow\infty}g(\alpha\lambda, \alpha v)\rightarrow \infty$ and so $\sup_{(\lambda, v) \in M} g(\lambda, v) = \infty$. On the other hand if there is no $(\lambda, v) \in M$ for which $g(\lambda, v)>0$ then the supremum must be less than or equal to 0; The supremum cannot be negative because $g(0,0)=0$, and so the supremum must be 0.

Related Question