Introduction to Statistical Learning Eq. 6.12 and 6.13
lasso, machine learning, ridge regression
Can someone please explain to me how the optimization of 6.12 leads to 6.14 and how the optimization of 6.13 leads to 6.15?
Best Answer
For the first equation, the result follows from setting the gradient to zero:
$$
\begin{aligned}
S &= \sum_{j=1}^p (y_j-\beta_j)^2 +\lambda\sum_{j=1}^p\beta_j^2\\
\end{aligned}
$$
at extrema,
$$
\begin{aligned}
\frac{\partial S}{\partial \beta_j} &=0\\
-2(y_j -\beta_j) +2\lambda\beta_j &= 0\\
\beta_j &= \frac{y_j}{1+\lambda}.
\end{aligned}
$$
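If you want to sanity-check that closed form numerically, here is a minimal sketch (assuming NumPy and SciPy are available; the $y_j$ values and $\lambda$ below are made up for illustration):

```python
# Quick numerical check of the ridge closed form beta_j = y_j / (1 + lambda).
# The y values and lambda are hypothetical, chosen only for illustration.
import numpy as np
from scipy.optimize import minimize

y = np.array([1.5, -0.3, 2.0, 0.7])   # responses in the n = p, X = I special case
lam = 0.8                              # ridge penalty

def S(beta):
    # Objective from the answer: sum (y_j - beta_j)^2 + lambda * sum beta_j^2
    return np.sum((y - beta) ** 2) + lam * np.sum(beta ** 2)

numeric = minimize(S, x0=np.zeros_like(y)).x
closed_form = y / (1 + lam)
print(numeric)
print(closed_form)
print(np.allclose(numeric, closed_form, atol=1e-4))  # True
```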
I think you should be able to derive the other expression using the same technique shown above, together with the fact that
$$
\vert \beta_j \vert = \begin{cases} \beta_j & \text{if } \beta_j \ge 0\\ -\beta_j & \text{if } \beta_j < 0\end{cases}.
$$
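Once you have worked through the case analysis, you can check the result numerically. The sketch below (assuming SciPy is available, with made-up $y_j$ and $\lambda$) minimizes the one-dimensional lasso objective directly and compares it with the piecewise (soft-thresholded) expression of Eq. 6.15:

```python
# Check the lasso case numerically: the minimizer of
# (y_j - beta_j)^2 + lambda * |beta_j| should match the piecewise solution.
# The y values and lambda are hypothetical, for illustration only.
from scipy.optimize import minimize_scalar

lam = 0.8
for y_j in [1.5, 0.2, -0.1, -2.0]:
    numeric = minimize_scalar(lambda b: (y_j - b) ** 2 + lam * abs(b),
                              bounds=(-10, 10), method="bounded").x
    # Piecewise solution from the case analysis (ISL Eq. 6.15)
    if y_j > lam / 2:
        closed = y_j - lam / 2
    elif y_j < -lam / 2:
        closed = y_j + lam / 2
    else:
        closed = 0.0
    print(y_j, round(numeric, 4), round(closed, 4))
```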
> The first principal component direction of the data is that along which the observations vary the most.
This is referring to the projections of the data onto that line, i.e., the variance explained by that line.
I think you might be interpreting it as something like:
> The first principal component direction is the dimension on which the residuals vary the most.
which is really the opposite statement: the residuals are the deviations from that line, and they carry the variance of everything that is not captured by the first principal component.
The first principal component is the one that captures as much of the variance as possible along that dimension.
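A small numerical illustration of that distinction, as a sketch on simulated 2-D data (assuming NumPy is available): the projections onto the first principal component have the largest possible variance, and the residuals hold whatever variance is left over.

```python
# Illustrate: variance of projections onto the first PC vs. variance of residuals.
# The data are simulated for illustration only.
import numpy as np

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[3.0, 1.2], [1.2, 1.0]], size=500)
X = X - X.mean(axis=0)                 # center the data

# First principal component direction = top right-singular vector of X
_, _, Vt = np.linalg.svd(X, full_matrices=False)
v1 = Vt[0]

scores = X @ v1                        # projections of the data onto the PC line
residuals = X - np.outer(scores, v1)   # deviations from that line

print("variance of projections:   ", scores.var())
print("variance left in residuals:", residuals.var(axis=0).sum())
print("total variance:            ", X.var(axis=0).sum())  # = projections + residuals
```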