After introducing the linear constraints $y_i = A_i x - b_i$, you form the Lagrangian and minimize it over the primal variables:
$$\mathcal{L}(x,y_i,\lambda_i)=\sum_i \left(\|y_i\|+\lambda_i^T(y_i - A_i x+b_i)\right)$$
$$\inf_{x, y_i} \mathcal{L}(x,y_i,\lambda_i)
=\begin{cases} -\infty & \text{ if } \sum_i A_i^T\lambda_i \ne 0 \ \text{ or } \ \|\lambda_i\|_* > 1 \text{ for some } i \\
\sum_i b_i^T \lambda_i & \text{ otherwise, }
\end{cases}
$$
where $\|\cdot\|_*$ is the dual norm (for the Euclidean norm, $\|\cdot\|_* = \|\cdot\|$).
So the dual problem is:
$$
\begin{gathered}\max_{\lambda_i} \sum_i b_i^T \lambda_i\\
\text{s.t.}: \sum_i A_i^T\lambda_i = 0, \\
\quad\|\lambda_i\|_* \le 1, \ \forall \ i.
\end{gathered}
$$
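As a sanity check, here is a toy numerical instance (my own, not from the original question) with $m=2$ and $A_1=A_2=I$, where both the primal optimum $\min_x \|x-b_1\|+\|x-b_2\| = \|b_1-b_2\|$ and the dual optimum have closed forms, so we can verify that they coincide:

```python
import numpy as np

rng = np.random.default_rng(0)
b1, b2 = rng.normal(size=3), rng.normal(size=3)

# Primal: min_x ||x - b1|| + ||x - b2||.  By the triangle inequality the
# optimum is ||b1 - b2||, attained at any point on the segment [b1, b2].
primal_opt = np.linalg.norm(b1 - b2)

# Dual: max b1^T l1 + b2^T l2  s.t.  l1 + l2 = 0, ||l_i|| <= 1.
# Substituting l2 = -l1 gives max_{||l1||<=1} (b1 - b2)^T l1 = ||b1 - b2||,
# attained at l1 = (b1 - b2) / ||b1 - b2||.
lam1 = (b1 - b2) / np.linalg.norm(b1 - b2)
dual_val = b1 @ lam1 + b2 @ (-lam1)

assert abs(primal_opt - dual_val) < 1e-9  # strong duality holds here
```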
- One issue is going from:
$a_{i}^{T}Xa_{i} \leq 1$
to:
$\mbox{tr}((a_{i}a_{i}^{T})X) \leq 1$.
This is done using a couple of standard tricks:
A. Since $\mbox{tr}(s)=s$ for any scalar $s$,
$a_{i}^{T}Xa_{i} = \mbox{tr}(a_{i}^{T}Xa_{i}) \leq 1$.
B. By the cyclic property of the trace of a product,
$\mbox{tr}(a_{i}^{T}Xa_{i}) = \mbox{tr}((a_{i}a_{i}^{T})X)$, so $\mbox{tr}((a_{i}a_{i}^{T})X) \leq 1$.
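Both tricks are easy to confirm numerically (a throwaway example with random data):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
a = rng.normal(size=n)
X = rng.normal(size=(n, n))
X = X @ X.T  # random symmetric positive semidefinite matrix

scalar_form = a @ X @ a                    # a^T X a, a scalar
trace_form = np.trace(np.outer(a, a) @ X)  # tr((a a^T) X)

assert abs(scalar_form - trace_form) < 1e-9
```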
A second issue is where $f_{0}^{*}(Y)$ comes from. This is discussed in example 3.23 on page 92 of the book.
The final step is applying (5.11) to the problem to get (5.15)
Our primal problem is
$\min \log \det \left( X^{-1} \right) $
subject to
$ \mathcal{A} X \preceq 1 $ (componentwise, with $1$ the all-ones vector),
where $\mathcal{A}$ is the linear operator
$\mathcal{A}X=\left[ \begin{array}{c}
\mbox{tr}((a_{1}a_{1}^{T})X) \\
\mbox{tr}((a_{2}a_{2}^{T})X) \\
\vdots \\
\mbox{tr}((a_{m}a_{m}^{T})X)
\end{array}
\right]. $
The adjoint (transpose) of $\mathcal{A}$ is
$\mathcal{A}^{T}y=\sum_{i=1}^{m} y_{i} \left( a_{i}a_{i}^{T} \right). $
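One can verify numerically that this really is the adjoint, i.e. that $y^{T}(\mathcal{A}X) = \mbox{tr}\!\left((\mathcal{A}^{T}y)\,X\right)$ for all symmetric $X$ and all $y$ (a random instance of my own):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 5
A_vecs = rng.normal(size=(m, n))  # the vectors a_1, ..., a_m
X = rng.normal(size=(n, n))
X = X + X.T                       # random symmetric X
y = rng.normal(size=m)

# A X is the m-vector with entries tr((a_i a_i^T) X) = a_i^T X a_i.
AX = np.array([a @ X @ a for a in A_vecs])
# A^T y is the n x n matrix sum_i y_i a_i a_i^T.
ATy = sum(yi * np.outer(a, a) for yi, a in zip(y, A_vecs))

lhs = y @ AX
rhs = np.trace(ATy @ X)
assert abs(lhs - rhs) < 1e-9  # <A X, y> = <X, A^T y>
```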
We have from (5.11) that
$g(\lambda) = -b^{T}\lambda - f_{0}^{*}(-\mathcal{A}^{T}\lambda),$
where I've suppressed the $\nu$ terms since there are no equality constraints.
Thus, since $b = 1$ here,
$g(\lambda)= -1^{T} \lambda -f_{0}^{*}(-\mathcal{A}^{T}\lambda).$
Then
$g(\lambda)=-1^{T}\lambda -\left( \log \det \left( (\mathcal{A}^{T}\lambda)^{-1} \right) -n \right)$
as long as $\mathcal{A}^{T}\lambda \succ 0$, so that $-\mathcal{A}^{T}\lambda$ lies in the domain of $f_{0}^{*}$.
Since $\log \det Z^{-1} = -\log \det Z$ for positive definite $Z$, this simplifies to
$g(\lambda)=
\left\{
\begin{array}{ll}
-1^{T}\lambda + \log \det \left( \mathcal{A}^{T}\lambda \right) +n & \mathcal{A}^{T}\lambda \succ 0 \\
-\infty & \mbox{otherwise.}
\end{array}
\right. $
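We can sanity-check this dual function via weak duality: for any primal-feasible $X$ and any $\lambda \ge 0$ we must have $g(\lambda) \le \log\det X^{-1}$. A small random instance (my own construction, not from the book):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 3, 6
A_vecs = rng.normal(size=(m, n))  # a_1, ..., a_m

# A primal-feasible X: scale the identity so a_i^T X a_i <= 1 for all i.
t = 1.0 / max(a @ a for a in A_vecs)
X = t * np.eye(n)
primal_val = -np.log(np.linalg.det(X))  # log det X^{-1}

# A dual-feasible lambda >= 0; with generic a_i (m >= n of them),
# A^T lambda = sum_i lambda_i a_i a_i^T is positive definite.
lam = rng.uniform(0.1, 1.0, size=m)
M = sum(l * np.outer(a, a) for l, a in zip(lam, A_vecs))
assert np.all(np.linalg.eigvalsh(M) > 0)

dual_val = -lam.sum() + np.log(np.linalg.det(M)) + n
assert dual_val <= primal_val + 1e-9  # weak duality: g(lambda) <= f0(X)
```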
As you suggested, you can define indicator functions for the sets:
$$ g_i(x) = \begin{cases} 0 & \ x \in X_i \\ +\infty & \ x \notin X_i \end{cases} $$ Their convex conjugates are precisely the support functions of those sets: $$ g_i^*(\lambda) = \sup_{x \in X_i} x^T\lambda $$
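For a concrete (standard) example: if $X_i$ is the Euclidean ball of radius $r$ centered at the origin, its support function is $g_i^*(\lambda) = r\|\lambda\|_2$, attained at $x^* = r\lambda/\|\lambda\|_2$. A quick numerical sketch:

```python
import numpy as np

rng = np.random.default_rng(4)
r = 2.0
lam = rng.normal(size=3)

# Support function of {x : ||x|| <= r}: sup x^T lam = r * ||lam||,
# attained at x* = r * lam / ||lam||.
x_star = r * lam / np.linalg.norm(lam)
support_val = r * np.linalg.norm(lam)
assert abs(x_star @ lam - support_val) < 1e-12

# No other point of the ball does better (Cauchy-Schwarz).
for _ in range(1000):
    x = rng.normal(size=3)
    x = x * min(1.0, r / np.linalg.norm(x))  # project onto the ball
    assert x @ lam <= support_val + 1e-12
```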
Now you can formulate the problem like a consensus optimization problem:
$$ \begin{gathered} \inf_{x_i,y_i,z} \sum_i f_i(x_i) + g_i(y_i)\\ \text{s.t.:} \ x_i = z,\ y_i = z \end{gathered} $$
Let's form the Lagrangian for this constrained problem and minimize over the "local" variables $x_i,y_i$ to get an expression in terms of the convex conjugates: $$ \inf_{x_i,y_i,z} \sum_i f_i(x_i) + g_i(y_i) + u_i^T(z-x_i) + v_i^T(z-y_i) \\= \inf_{z} \sum_i -f_i^*(u_i) - g_i^*(v_i) + (u_i+v_i)^T z $$

Minimizing over the "consensus" variable $z$ gives an implicit equality constraint. The dual problem is thus: $$ \begin{gathered} \sup_{u_i,v_i} \sum_i -f_i^*(u_i) - g_i^*(v_i)\\ \text{s.t.:} \ \sum_i u_i + v_i = 0 \end{gathered} $$

Some caveats: this formulation is only really useful if the sets $X_i$ admit simple support functions $g_i^*$. Also, while it always gives a lower bound on the optimal value of the original problem, strong duality requires some technical conditions. (See the hypotheses of Fenchel's duality theorem in a convex analysis text.)
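The lower-bound property can be checked on a small instance of my own choosing: take $f_i(x)=\tfrac12\|x-c_i\|^2$ (so $f_i^*(u)=\tfrac12\|u\|^2+u^Tc_i$) and $X_i$ a ball of radius $r_i$ about the origin (so $g_i^*(v)=r_i\|v\|$):

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 3, 2
c = rng.normal(size=(m, n))        # centers of the quadratics f_i
r = rng.uniform(1.0, 2.0, size=m)  # radii of the balls X_i

# Primal value at a feasible point: z = 0 lies in every ball X_i.
z = np.zeros(n)
primal_val = sum(0.5 * np.linalg.norm(z - ci) ** 2 for ci in c)

# A dual-feasible point: pick u_i freely and set v_i = -u_i, so that
# sum_i (u_i + v_i) = 0 holds automatically.
u = rng.normal(size=(m, n))
v = -u
dual_val = sum(-(0.5 * ui @ ui + ui @ ci) - ri * np.linalg.norm(vi)
               for ui, vi, ci, ri in zip(u, v, c, r))

assert dual_val <= primal_val + 1e-9  # dual value lower-bounds the primal
```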