Change of variable in optimization problem


I have an optimization problem of the form:
\begin{alignat}{4}
&\text{minimize}\quad &&f(x) \\
&\text{subject to} \quad && \|x\|_2 \leq \varepsilon
\end{alignat}

with variable $x \in \mathbb{R}^n$, where $f$ is Lipschitz continuous.

I want to convert this problem into one with only box constraints, i.e., a problem of the form:

\begin{alignat}{4}
&\text{minimize}\quad &&f(g(z)) \\
&\text{subject to} \quad && z_i \in [a_i, b_i] \quad \text{for} \quad 1 \leq i \leq n
\end{alignat}

Ideally, $g$ should also be Lipschitz continuous. Any idea what function I can use for this change of variables to parametrize the $L_2$ norm ball? Is it easy to do?

The reason I want to make this change of variables is to use the adaptive nested optimization method from this paper: https://link.springer.com/article/10.1007/s10898-015-0355-7.

Best Answer

In general, I don't think it's very easy, but there are a few things you can do. Firstly, $\|x\|_2 \leq \epsilon$ is equivalent to $\|x\|_2^2 \leq \epsilon^2$, that is,

$$ \sum_{i=1}^{n}x_i^2 \leq \epsilon^2.\tag{1}\label{1} $$

You can then introduce $n$ auxiliary scalar decision variables, $t_1,\ldots, t_n$, and replace \eqref{1} by

\begin{align} x_i^2 {}\leq{}& t_i, \\ \sum_{i=1}^{n}t_i {}\leq{}& \epsilon^2. \end{align}

This is an exact reformulation: any $x$ satisfying \eqref{1} is feasible with $t_i = x_i^2$, and conversely the two inequalities combine to give $\sum_{i=1}^{n} x_i^2 \leq \sum_{i=1}^{n} t_i \leq \epsilon^2$.
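As a quick sanity check on this equivalence, here is a small Python snippet of mine (not part of the original answer): it samples points satisfying the lifted constraints and confirms they lie in the $\epsilon$-ball.

```python
import numpy as np

rng = np.random.default_rng(0)
n, eps = 4, 0.5

for _ in range(1000):
    t = rng.random(n)
    t *= eps**2 / max(np.sum(t), eps**2)        # scale so that sum(t) <= eps^2
    x = rng.uniform(-1.0, 1.0, n) * np.sqrt(t)  # ensures x_i^2 <= t_i
    assert np.linalg.norm(x, 2) <= eps + 1e-12  # hence ||x||_2 <= eps
```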

In case you cannot handle $\sum_{i=1}^{n}t_i {}\leq{} \epsilon^2$ directly, you can associate with it the penalty function $\psi(t) = \max\{\sum_{i=1}^{n}t_i -\epsilon^2, 0\}$ and define the modified cost function

$$ \tilde{f}(x,t; \lambda) = f(x) +\lambda \psi(t). $$

This is Lipschitz continuous, and you can apply a penalty method on top of your algorithm (or an augmented Lagrangian method; you have to see which one works better). Simply take $\lambda \to \infty$ and stop once $\psi(t)$ becomes sufficiently small.

Let me just clarify that for each $\lambda$ you will have to solve the following problem:

\begin{align} &\operatorname*{Minimize}_{x,t} \tilde{f}(x,t; \lambda) \\ &\text{subject to } x_i^2 \leq t_i, \text{ for all } i =1,\ldots, n. \end{align}
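For concreteness, here is a minimal SciPy sketch of this penalty loop; the quadratic objective, the penalty schedule, and the tolerance are placeholder assumptions of mine, not part of the answer, and $\psi$ is nonsmooth, so a smooth local solver like SLSQP is only a heuristic here.

```python
import numpy as np
from scipy.optimize import minimize

n, eps = 3, 1.0
f = lambda x: np.sum((x - 2.0) ** 2)            # toy objective (assumption)

def psi(t):
    # penalty for the coupling constraint sum(t) <= eps^2
    return max(np.sum(t) - eps**2, 0.0)

def solve_subproblem(lam, z0):
    # z = (x, t); minimize f(x) + lam * psi(t)  subject to  x_i^2 <= t_i
    obj = lambda z: f(z[:n]) + lam * psi(z[n:])
    cons = [{"type": "ineq", "fun": lambda z, i=i: z[n + i] - z[i] ** 2}
            for i in range(n)]
    return minimize(obj, z0, method="SLSQP", constraints=cons)

z = np.zeros(2 * n)
for lam in 10.0 ** np.arange(7):                # crude schedule lam = 1, 10, ..., 1e6
    z = solve_subproblem(lam, z).x
    if psi(z[n:]) < 1e-8:                       # stop once the penalty is inactive
        break
x_star = z[:n]                                  # approx. projection of (2,2,2) onto the eps-ball
```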

Update 1. Under the stronger assumption that $f$ is continuously differentiable with Lipschitz-continuous gradient (with Lipschitz constant $L$), it turns out that the original problem is equivalent to the unconstrained minimization problem

$$ \operatorname*{Minimize}_{x} f(x) - \frac{\gamma}{2}\|\nabla f(x)\|^2 + \frac{1}{2\gamma}\operatorname{dist}^2_D(x - \gamma \nabla f(x)),\tag{2}\label{2} $$

where $\gamma \in (0, 1/L)$, $D$ is the constraint set (I'm using the same notation as the paper you cited), and $\operatorname{dist}^2_D$ is the squared distance from $D$, which is easy to compute when $D$ is a box or a ball. The cost function in \eqref{2} is known as the forward-backward envelope of the original problem.
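For illustration, evaluating \eqref{2} is cheap when $D$ is a box, since the projection onto $D$ is a componentwise clip. A minimal sketch, assuming a toy quadratic $f$ and a box of my choosing (neither comes from the paper):

```python
import numpy as np

lo, hi = -1.0, 1.0                     # D = [lo, hi]^n (placeholder box)
L = 1.0                                # Lipschitz constant of grad_f below
gamma = 0.5 / L                        # gamma must lie in (0, 1/L)

f      = lambda x: 0.5 * np.sum((x - 2.0) ** 2)   # toy smooth objective (assumption)
grad_f = lambda x: x - 2.0

def fbe(x):
    g = grad_f(x)
    z = x - gamma * g                              # forward (gradient) step
    dist2 = np.sum((z - np.clip(z, lo, hi)) ** 2)  # squared distance to D
    return f(x) - 0.5 * gamma * np.sum(g ** 2) + dist2 / (2.0 * gamma)
```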

If $f$ is just Lipschitz, the above does not apply. Generally speaking, this is too weak an assumption in optimization (with the exception of global optimization, where people resort to branch-and-bound-type methods).

Update 2. You can use a homeomorphism between $B_2 := \{x\in\mathbb{R}^n: \|x\|_2 \leq 1\}$ and $B_\infty := \{x\in\mathbb{R}^n: \|x\|_\infty \leq 1\}$, i.e., a function $\phi: B_\infty \to B_2$ such as

$$ \phi: B_{\infty} \ni x \mapsto \phi(x) = \frac{\|x\|_\infty}{\|x\|_2}x \in B_2 $$

for $x\neq 0$, and $\phi(0) = 0$.

The inverse is

$$ \phi^{-1}: B_2\ni z \mapsto \phi^{-1}(z) = \frac{\|z\|_2}{\|z\|_\infty}z \in B_{\infty} $$

for $z\neq 0$ and $\phi^{-1}(0) = 0$.
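A quick numerical check of this pair (the dimension and the test point are arbitrary choices of mine; both maps below are only defined away from the origin):

```python
import numpy as np

phi     = lambda x: (np.linalg.norm(x, np.inf) / np.linalg.norm(x, 2)) * x
phi_inv = lambda z: (np.linalg.norm(z, 2) / np.linalg.norm(z, np.inf)) * z

x = np.random.default_rng(1).uniform(-1.0, 1.0, 5)  # nonzero point of B_inf (a.s.)
assert np.linalg.norm(phi(x), 2) <= 1.0 + 1e-12     # phi(x) lands in B_2
assert np.allclose(phi_inv(phi(x)), x)              # round trip recovers x
```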

The constraint in your problem can be written as $\frac{x}{\epsilon} \in B_2$. Define $u = \phi^{-1}(x/\epsilon)$, so that $\frac{x}{\epsilon} = \phi(u)$, i.e., $x = \epsilon \phi(u)$. The constraint becomes

\begin{align} \frac{x}{\epsilon} \in B_2 \Leftrightarrow{}&{}\frac{x}{\epsilon} \in \phi (B_\infty) \\ \Leftrightarrow{}&{}\phi(u) \in \phi (B_\infty) \\ \Leftrightarrow{}&{}u \in B_{\infty} \end{align}

and the optimization problem becomes

\begin{align} &\operatorname*{Minimize}_u f(\epsilon \phi(u)) \\ &\text{subject to } -1 \leq u \leq 1 \end{align}

Once you find an optimal solution $u^\star$ (if one exists), you can retrieve $x^\star = \epsilon \phi(u^\star)$. Note that $\phi$ is Lipschitz continuous (it is continuous everywhere and almost-everywhere differentiable with a uniformly bounded Jacobian), so this change of variables preserves the Lipschitz continuity you asked for.
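Putting the pieces together, a hedged end-to-end sketch (the objective and the solver choice are mine; since $\phi$ is nonsmooth wherever the maximal coordinate ties, a smooth local solver such as L-BFGS-B is only a heuristic):

```python
import numpy as np
from scipy.optimize import minimize

n, eps = 3, 1.0
f = lambda x: np.sum((x - 2.0) ** 2)      # toy objective (assumption)

def phi(u):
    # phi: B_inf -> B_2, with phi(0) = 0
    nrm2 = np.linalg.norm(u, 2)
    if nrm2 == 0.0:
        return np.zeros_like(u)
    return (np.linalg.norm(u, np.inf) / nrm2) * u

res = minimize(lambda u: f(eps * phi(u)),
               x0=np.full(n, 0.5),
               bounds=[(-1.0, 1.0)] * n,   # the box constraints u in B_inf
               method="L-BFGS-B")
x_star = eps * phi(res.x)                  # map the box solution back to the eps-ball
```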
