I shall prove that the problems are in fact equivalent, in the sense that they both have the same optimal solution $x^*$. Let $P_1$ be the first optimization problem, and $P_2$ the second, and let $\alpha\succ 0$.
If $\tilde{x}$ is the optimal solution for $P_1$, then we know from the book that each $\tilde{x}_i$ satisfies the relation
$$
\tilde{x}_i = \max\{0,1/\tilde{\nu} - \alpha_i\},
$$
where $\tilde{\nu}$ is the optimal dual variable associated with the equality constraint.
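(As a computational aside: since $\sum_i\max\{0,\,L-\alpha_i\}$ is continuous and nondecreasing in the water level $L = 1/\tilde{\nu}$, one can locate $\tilde{\nu}$ by bisection. A minimal Python sketch with made-up data $\alpha$; the function name is mine.)
```python
import numpy as np

def water_filling(alpha, tol=1e-12):
    """Solve sum_i max(0, L - alpha_i) = 1 for the water level L = 1/nu_tilde
    by bisection, then return x_tilde and nu_tilde."""
    alpha = np.asarray(alpha, dtype=float)
    g = lambda L: np.maximum(0.0, L - alpha).sum()   # nondecreasing in L
    lo, hi = alpha.min(), alpha.min() + 1.0          # g(lo) = 0 <= 1 <= g(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) < 1.0 else (lo, mid)
    L = 0.5 * (lo + hi)
    return np.maximum(0.0, L - alpha), 1.0 / L

alpha = np.array([0.3, 0.7, 1.2, 0.1])               # made-up data
x_tilde, nu_tilde = water_filling(alpha)
print(x_tilde, x_tilde.sum())                        # x_tilde sums to 1
```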
Now, write $P_2$ equivalently as
$$\begin{split}
\text{minimize} &\quad t\\
P_2:\quad\text{subject to}&\quad t\geq -\log(\alpha_i + x_i),\quad i=1,\ldots,n\\
&\quad x\succeq0,\quad \mathbf{1}^\top x=1.
\end{split}$$
Then, the Lagrangian is
$$\begin{split}
L(t,x,\mu,\lambda,\nu) &= t - \sum_{i=1}^n \mu_i(\log(\alpha_i + x_i) + t) - \lambda^\top x + \nu(\mathbf{1}^\top x - 1)\\
&= t(1 - \mathbf{1}^\top\mu) - \sum_{i=1}^n\mu_i\log(\alpha_i + x_i) - \lambda^\top x + \nu(\mathbf{1}^\top x - 1).
\end{split}$$
Using this, you can write the KKT conditions of $P_2$ for the primal and dual optimal variables $t^*,x^*,\mu^*,\lambda^*$ and $\nu^*$ as
$$x^*\succeq 0,\quad \mathbf{1}^\top x^* = 1,\quad \lambda^*\succeq0,~\mu^*\succeq 0,\quad \lambda_i^*x_i^*=0,\quad t^* + \log(\alpha_i + x_i^*) \geq 0,\quad \mu_i^*(t^* + \log(\alpha_i + x^*_i)) = 0,\quad i=1,\ldots,n,$$
and
$$\partial/\partial x_i \left[ L(t^*,x^*,\mu^*,\lambda^*,\nu^*)\right]= -\mu_i^*/(\alpha_i + x_i^*) - \lambda_i^* + \nu ^* = 0,~i=1,\ldots,n,\quad \mathbf{1}^\top\mu^* = 1.$$
The last equality is the stationarity condition with respect to $t$: if $\mathbf{1}^\top\mu \neq 1$, the Lagrangian is unbounded below in $t$ and the dual function is $-\infty$. Eliminating $\lambda^*$ yields:
$$x^*\succeq 0,\quad \mathbf{1}^\top x^* = 1,\quad x_i^*(\nu^* - \mu_i^*/(\alpha_i + x_i^*))=0,\quad \nu^*\geq \mu_i^*/(\alpha_i + x_i^*),\quad i=1,\ldots,n,$$
and
$$\mu^*\succeq 0,\quad \mathbf{1}^\top\mu^* = 1,\quad t^* + \log(\alpha_i + x_i^*) \geq 0,\quad \mu_i^*(t^* + \log(\alpha_i + x^*_i)) = 0,\quad i=1,\ldots,n.$$
These are in fact very similar conditions to those of $P_1$. Now, let $M = \{i\in \{1,\ldots,n\} ~|~ \mu_i^* > 0\}$. This is a nonempty set, because $\mathbf{1}^\top \mu^* = 1$. For all $i\in M$, you can thus deduce (in a similar way to the book) that in fact
$$x_i^* = \max\{0, \mu_i^*/\nu^* - \alpha_i\}.$$
Furthermore, for all $i\in M$ we have that $t^* = -\log(\alpha_i + x_i^*)$.
Now, let $M' = \{1,\ldots,n\}\setminus M$. Since $\mu_i^* = 0$ for all $i\in M'$, the conditions above reduce to
$$
x_i^*\nu^* = 0,\quad \nu^* \geq 0,\quad t^* + \log(\alpha_i + x_i^*) \geq 0.
$$
We can't have $\nu^* = 0$: for any $j\in M$ it would imply $0 < \mu_j^*/(\alpha_j + x_j^*) \leq 0$, which is impossible. Since $\nu^* > 0$, the only option is that $x_i^* = 0$ for all $i\in M'$.
Thus, the problem reduces to solving the equation:
$$
\sum_{i\in M} x_i^* = \sum_{i\in M}\max\{0,\mu_i^*/\nu^* - \alpha_i\} = 1.
$$
However, the multiplier $\mu^*$ need not be unique: any choice satisfying the KKT conditions will do. Let $K = \{i\in\{1,\ldots,n\}~|~ \tilde{x}_i > 0\}$, and suppose we set the multipliers to
$$
\mu_i^* = \begin{cases}
1/\lvert K \rvert & i \in K\\
0 &\text{otherwise,}
\end{cases}
$$
then $\mathbf{1}^\top \mu^* = 1$ and $\mu^*\succeq 0$. Clearly, we also have $M = K$, which implies $x_i^* = \tilde{x}_i=0$ for all $i\in M'$. Now, set $\nu^* = \tilde{\nu}/\lvert M \rvert > 0$; then $\mu_i^*/\nu^* = 1/\tilde{\nu}$ for $i\in M$, so $x_i^* = \max\{0, 1/\tilde{\nu} - \alpha_i\} = \tilde{x}_i$ for all $i\in M$, and we get exactly the same solution as $P_1$. Since $\tilde{x}_i + \alpha_i = \tilde{x}_j + \alpha_j = 1/\tilde{\nu}$ for all $i,j\in M$, this implies
$$
\log(x_i^* + \alpha_i) = \log(x_j^* + \alpha_j),
$$
thus $t^* = -\log(x_i^* + \alpha_i) = \log\tilde{\nu}$ for all $i\in M$. For $i\in M'$ we have $\tilde{x}_i = 0$, so $\alpha_i \geq 1/\tilde{\nu}$ and hence $t^* + \log(\alpha_i + x_i^*) = \log(\tilde{\nu}\alpha_i) \geq 0$; so all of the KKT conditions are satisfied.
Therefore, solving $P_2$ is equivalent to solving $P_1$. The optimal value of $P_2$ is $-\log(\alpha_i + \tilde{x}_i)$, for any $\tilde{x}_i > 0$.
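As a purely numerical sanity check (not part of the argument), one can hand both problems to a modelling tool and compare the solutions. A sketch using CVXPY, assuming $P_1$ is the water-filling problem $\text{minimize } -\sum_i\log(\alpha_i+x_i)$ subject to $x\succeq 0$, $\mathbf{1}^\top x=1$ (consistent with the formula quoted above) and $P_2$ is its min-max counterpart; the data $\alpha$ is made up:
```python
import cvxpy as cp
import numpy as np

alpha = np.array([0.3, 0.7, 1.2, 0.1])   # made-up data
n = alpha.size

# P1: minimize -sum_i log(alpha_i + x_i)   s.t. x >= 0, 1'x = 1
x1 = cp.Variable(n)
cp.Problem(cp.Minimize(-cp.sum(cp.log(alpha + x1))),
           [x1 >= 0, cp.sum(x1) == 1]).solve()

# P2: minimize max_i -log(alpha_i + x_i)   s.t. x >= 0, 1'x = 1
x2 = cp.Variable(n)
cp.Problem(cp.Minimize(cp.max(-cp.log(alpha + x2))),
           [x2 >= 0, cp.sum(x2) == 1]).solve()

print(np.round(x1.value, 4))   # same point, up to solver tolerance
print(np.round(x2.value, 4))
```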
Best Answer
In general, I don't think it's very easy, but there are a few things you can do. Firstly, $\|x\|_2 \leq \epsilon$ is equivalent to $\|x\|_2^2 \leq \epsilon^2$, that is
$$ \sum_{i=1}^{n}x_i^2 \leq \epsilon^2.\tag{1}\label{1} $$
You can then introduce $n$ auxiliary scalar decision variables, $t_1,\ldots, t_n$, and replace \eqref{1} by
\begin{align} x_i^2 {}\leq{}& t_i, \\ \sum_{i=1}^{n}t_i {}\leq{}& \epsilon^2. \end{align}
(Taking $t_i = x_i^2$ shows that the two formulations describe the same set of $x$.)
In case you cannot handle $\sum_{i=1}^{n}t_i {}\leq{} \epsilon^2$ directly, you can associate with it the penalty function $\psi(t) = \max\{\sum_{i=1}^{n}t_i -\epsilon^2, 0\}$ and define the modified cost function
$$ \tilde{f}(x,t; \lambda) = f(x) +\lambda \psi(t). $$
This modified cost is Lipschitz continuous (whenever $f$ is), and you can apply a penalty method on top of your algorithm (or an augmented Lagrangian method; you have to see which one works better). Simply take $\lambda \to \infty$ and stop once $\psi(t)$ becomes sufficiently small.
Let me just clarify that for each $\lambda$ you will have to solve the following problem
\begin{align} &\operatorname*{Minimize}_{x,t} \tilde{f}(x,t; \lambda) \\ &\text{subject to } x_i^2 \leq t_i, \text{ for all } i =1,\ldots, n. \end{align}
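For illustration only, here is a rough Python/SciPy sketch of the outer penalty loop. The data and the choice of $f$ are made up, and the generic SLSQP call merely stands in for whatever inner solver you actually use (it is not guaranteed to handle the kink in $\psi$ gracefully):
```python
import numpy as np
from scipy.optimize import minimize

def solve_inner(f, n, eps, lam, z0):
    """One inner problem: minimize_{x,t} f(x) + lam*psi(t)  s.t.  x_i^2 <= t_i."""
    def cost(z):
        x, t = z[:n], z[n:]
        return f(x) + lam * max(np.sum(t) - eps**2, 0.0)    # f + lam*psi
    cons = [{'type': 'ineq', 'fun': (lambda z, i=i: z[n + i] - z[i]**2)}
            for i in range(n)]                              # t_i - x_i^2 >= 0
    return minimize(cost, z0, method='SLSQP', constraints=cons)

# Made-up data: f(x) = ||x - c||_2 (Lipschitz), constraint ||x||_2 <= eps
c, eps = np.array([2.0, -1.0]), 1.0
f = lambda x: np.linalg.norm(x - c)
n, z = c.size, np.zeros(2 * c.size)
for lam in [1.0, 10.0, 100.0, 1000.0]:            # lambda -> infinity
    z = solve_inner(f, n, eps, lam, z).x
    if max(np.sum(z[n:]) - eps**2, 0.0) < 1e-6:   # psi(t) small enough: stop
        break
print(z[:n], np.linalg.norm(z[:n]))               # ||x||_2 approximately <= eps
```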
Update 1. Under the stronger assumption that $f$ is continuously differentiable with Lipschitz gradient (with constant $L$), it turns out that the original problem is equivalent, for any $0 < \gamma < 1/L$, to the unconstrained minimization problem
$$ \operatorname*{Minimize}_{x} f(x) - \frac{\gamma}{2}\|\nabla f(x)\|^2 + \frac{1}{2\gamma}\operatorname{dist}^2_D(x - \gamma \nabla f(x)),\tag{2}\label{2} $$
where $D$ is the set of box constraints (I'm using the same notation as the paper you cited), and $\operatorname{dist}^2_D$ is the squared distance from $D$, which is easy to compute. The cost function in \eqref{2} is known as the forward-backward envelope of the original problem (see for example this paper).
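To make \eqref{2} concrete, here is a minimal sketch of how the envelope could be evaluated when $D$ is a box, assuming a made-up quadratic $f$ so that $\nabla f$ is Lipschitz; minimizing the envelope itself still requires an algorithm on top of this evaluation:
```python
import numpy as np

def fbe(x, f, grad_f, lower, upper, gamma):
    """Value of the forward-backward envelope (2) for D = box [lower, upper]^n."""
    g = grad_f(x)
    y = x - gamma * g                          # forward (gradient) step
    d = y - np.clip(y, lower, upper)           # y minus its projection onto D
    return f(x) - 0.5 * gamma * g.dot(g) + d.dot(d) / (2.0 * gamma)

# Made-up f with Lipschitz gradient: f(x) = 0.5 * ||A x - b||^2
rng = np.random.default_rng(0)
A, b = rng.standard_normal((5, 3)), rng.standard_normal(5)
f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
grad_f = lambda x: A.T @ (A @ x - b)
L = np.linalg.norm(A.T @ A, 2)                 # Lipschitz constant of grad f
gamma = 0.9 / L                                # any gamma in (0, 1/L)
print(fbe(np.zeros(3), f, grad_f, -1.0, 1.0, gamma))
```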
If $f$ is just Lipschitz the above does not apply. Generally speaking, this is too weak an assumption in optimization (with the exception of global optimization where people resort to branch-and-bound-type methods).
Update 2. You can use a homeomorphism between $B_2 := \{x\in\mathbb{R}^n: \|x\|_2 \leq 1\}$ and $B_\infty := \{x\in\mathbb{R}^n: \|x\|_\infty \leq 1\}$, i.e., a function $\phi: B_\infty\to B_{2}$ such as
$$ \phi: B_{\infty} \ni x \mapsto \phi(x) = \frac{\|x\|_\infty}{\|x\|_2}x \in B_2 $$
for $x\neq 0$ and $\phi(0) = 0$.
The inverse is
$$ \phi^{-1}: B_2\ni z \mapsto \phi^{-1}(z) = \frac{\|z\|_2}{\|z\|_\infty}z \in B_{\infty} $$
for $z\neq 0$ and $\phi^{-1}(0) = 0$.
The constraints in your problem can be written as $\frac{x}{\epsilon} \in B_2$. Define $u = \phi^{-1}(x/\epsilon)$. Then $\frac{x}{\epsilon} = \phi(u) \Leftrightarrow x = \epsilon \phi(u)$. The constraints become
\begin{align} \frac{x}{\epsilon} \in B_2 \Leftrightarrow{}&{}\frac{x}{\epsilon} \in \phi (B_\infty) \\ \Leftrightarrow{}&{}\phi(u) \in \phi (B_\infty) \\ \Leftrightarrow{}&{}u \in B_{\infty} \end{align}
and the optimization problem becomes
\begin{align} &\operatorname*{Minimize}_u f(\epsilon \phi(u)) \\ &\text{subject to } -1 \leq u \leq 1 \end{align}
Once you find an optimal solution $u^\star$ (if one exists), you can retrieve $x^\star = \epsilon \phi(u^\star)$.
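Here is a minimal numerical sketch of this change of variables; the quadratic $f$, the data, and the use of L-BFGS-B are placeholders of mine, and since $\phi$ is nonsmooth (at $0$ and wherever the maximizing coordinate changes), a smooth solver is only a heuristic:
```python
import numpy as np
from scipy.optimize import minimize

def phi(u):
    """phi: B_inf -> B_2, phi(u) = (||u||_inf / ||u||_2) * u, with phi(0) = 0."""
    nrm2 = np.linalg.norm(u)
    return u if nrm2 == 0.0 else (np.max(np.abs(u)) / nrm2) * u

# Made-up data: minimize f(x) = ||x - c||^2 subject to ||x||_2 <= eps
c, eps = np.array([2.0, -1.0, 0.5]), 1.0
f = lambda x: np.sum((x - c) ** 2)

obj = lambda u: f(eps * phi(u))                        # problem in the u variable
res = minimize(obj, np.full(c.size, 0.5), method='L-BFGS-B',
               bounds=[(-1.0, 1.0)] * c.size)          # box constraint -1 <= u <= 1
x_star = eps * phi(res.x)                              # map back: x* = eps * phi(u*)
print(x_star, np.linalg.norm(x_star))                  # always inside the eps-ball
```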