LASSO – Comprehensive Overview of Group Lasso Derivation in Regularization

lasso regularization

I've been reading the group lasso paper, "Model selection and estimation in regression with
grouped variables": http://www.stat.washington.edu/courses/stat527/s13/readings/yuanlin07.pdf

On page 53 of the paper, I don't understand how to obtain Eqn. 2.3, or how to derive Eqn. 2.4 from Eqn. 2.2.

Here is my derivation, starting from Eqn. 2.2:
$$
-X_j^TY+X_j^TX_j\beta_j+\sum_{i\neq j}X_j^TX_i\beta_i + \frac{\lambda \beta_j \sqrt{p_j}}{\|\beta_j\|}=0
$$
then, we have
$$
\left( X_j^TX_j+\frac{\lambda \sqrt{p_j}}{\|\beta_j\|} \right)\beta_j=X_j^T\Bigl(Y-\sum_{i\neq j}X_i\beta_i\Bigr)=S_j
$$
So,
$$
\beta_j={\left( X_j^TX_j+\frac{\lambda\sqrt{p_j}}{\|\beta_j\|} \right)}^{-1}S_j
$$
This result differs slightly from Eqn. 2.4 in the paper. Could you please point out what is wrong with my derivation? In addition, the corresponding result in another document, http://statweb.stanford.edu/~tibs/ftp/sparse-grlasso.pdf, is also different.

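UPDATE: Yes, the first problem is solved. Assuming $X_j^TX_j=I_{p_j}$, and writing $\beta_{-j}$ for $\beta$ with the $j$th group set to zero, we have
$$
S_j=X_j^T(Y-X\beta_{-j})=X_j^Tr_j
$$
Taking norms on both sides of $\left(1+\frac{\lambda\sqrt{p_j}}{\|\beta_j\|}\right)\beta_j=S_j$ gives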
$$
\|\beta_j\|=\|S_j\|{\left( 1+\frac{\lambda\sqrt{p_j}}{\|\beta_j\|} \right)}^{-1}
$$
Rearranging, we get
$$
\|\beta_j\|=\|S_j\|-\lambda\sqrt{p_j}
$$
Substituting this back into the expression for $\beta_j$, we get
$$
\beta_j={\left( 1-\frac{\lambda\sqrt{p_j}}{\|S_j\|} \right)}S_j
$$
But where does the $+$ (the positive part $(\cdot)_+$ in the paper's formula) come from?

Best Answer

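The subdifferential of the $L_2$ norm $\|\beta_j\|$ is as follows:
1. $\frac{\beta_j}{\|\beta_j\|}$ when $\beta_j \ne 0$.
2. any vector $s_j$ with $\| s_j \| \le 1$ when $\beta_j = 0$.
Your derivation covers only the first case, which requires $\|S_j\| > \lambda\sqrt{p_j}$ so that $\|\beta_j\| = \|S_j\|-\lambda\sqrt{p_j}$ is positive. In the second case, the stationarity condition becomes $S_j = \lambda\sqrt{p_j}\,s_j$, which has a solution with $\|s_j\| \le 1$ exactly when $\|S_j\| \le \lambda\sqrt{p_j}$, and then $\beta_j = 0$. Combining the two cases gives the plus sign in the formula:
$$
\beta_j={\left( 1-\frac{\lambda\sqrt{p_j}}{\|S_j\|} \right)}_+S_j
$$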
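
The two cases can also be checked numerically. Below is a minimal Python sketch, assuming NumPy, an orthonormal group block ($X_j^TX_j=I$), and the objective $\frac12\|Y-X\beta\|^2+\lambda\sum_j\sqrt{p_j}\|\beta_j\|$. The helper names `group_soft_threshold` and `kkt_ok` are hypothetical (they are not from the paper), and the data are random, chosen only for illustration.

```python
import numpy as np

def group_soft_threshold(S_j, lam, p_j):
    """Return (1 - lam*sqrt(p_j)/||S_j||)_+ * S_j (assumes X_j^T X_j = I)."""
    norm = np.linalg.norm(S_j)
    if norm == 0.0:
        return np.zeros_like(S_j)
    # The max(0, .) is the positive part: it zeroes the whole group
    # whenever ||S_j|| <= lam * sqrt(p_j).
    scale = max(0.0, 1.0 - lam * np.sqrt(p_j) / norm)
    return scale * S_j

def kkt_ok(beta_j, S_j, lam, p_j, tol=1e-10):
    """Check the subgradient condition -S_j + beta_j + lam*sqrt(p_j)*s_j = 0."""
    thr = lam * np.sqrt(p_j)
    nb = np.linalg.norm(beta_j)
    if nb > 0:
        # Case 1: beta_j != 0, so s_j = beta_j / ||beta_j||.
        return np.linalg.norm(-S_j + beta_j + thr * beta_j / nb) <= tol
    # Case 2: beta_j == 0, so some s_j with ||s_j|| <= 1 must satisfy
    # S_j = thr * s_j, i.e. ||S_j|| <= thr.
    return np.linalg.norm(S_j) <= thr + tol

# Illustrative random data (an assumption, not from the paper).
rng = np.random.default_rng(0)
n, p_j = 50, 3
X_j, _ = np.linalg.qr(rng.standard_normal((n, p_j)))  # orthonormal columns
r_j = rng.standard_normal(n)                          # partial residual
S_j = X_j.T @ r_j

# Pick lambda below and above the threshold ||S_j|| / sqrt(p_j).
lam_small = 0.5 * np.linalg.norm(S_j) / np.sqrt(p_j)
lam_large = 2.0 * np.linalg.norm(S_j) / np.sqrt(p_j)
beta_small = group_soft_threshold(S_j, lam_small, p_j)
beta_large = group_soft_threshold(S_j, lam_large, p_j)

print("small lambda:", beta_small, kkt_ok(beta_small, S_j, lam_small, p_j))
print("large lambda:", beta_large, kkt_ok(beta_large, S_j, lam_large, p_j))
```

With the smaller $\lambda$ the group is shrunk but kept, and with the larger $\lambda$ it is set to zero. Without the positive part, the second case would give a negative multiple of $S_j$, which fails the optimality condition.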