I'm taking an ML course. We were recently given an assignment to implement least-squares linear regression with a $p$-norm regularizer, where $1 \leq p \leq 2$: $p = 1$ gives LASSO and $p = 2$ gives ridge regression. Now we are asked to implement a generalized solution that works for any value of $p$ between $1$ and $2$. Is that really possible? I can currently solve the ridge case using gradient descent with a constant step size, but I don't see how to approach the problem for an arbitrary $p$ between $1$ and $2$.
The Objective Function:
$$ \frac{1}{2} \left\| A x - b \right\|_{2}^{2} + \lambda \left\| x \right\|_{p} $$
Where $ 1 \leq p \leq 2 $ is given.
Best Answer
There are two similar problems:
Problem I
$$ \arg \min_{x} \frac{1}{2} \left\| A x - b \right\|_{2}^{2} + \lambda \left\| x \right\|_{p} $$
Problem II
$$ \arg \min_{x} \frac{1}{2} \left\| A x - b \right\|_{2}^{2} + \lambda \left\| x \right\|_{p}^{p} $$
Solution Problem I
The function is given by:
$$ f \left( x \right) = \frac{1}{2} \left\| A x - b \right\|_{2}^{2} + \lambda \left\| x \right\|_{p} $$
The derivative is given by:
$$ \frac{d}{d x} f \left( x \right) = {A}^{T} \left( A x - b \right) + \lambda p X x $$
Where the matrix $ X $ is a diagonal matrix given by:
$$ {X}_{ii} = \left| {x}_{i} \right|^{p - 2} \frac{\left\| x \right\|_{p}^{1 - p}}{p} $$
The derivative vanishes at:
$$ x = \left( {A}^{T} A + \lambda p X \right)^{-1} {A}^{T} b $$
Since $ X $ depends on $ x $, the problem can be solved using Iteratively Reweighted Least Squares (IRLS):
$$ {x}^{k + 1} = \left( {A}^{T} A + \lambda p {X}^{k} \right)^{-1} {A}^{T} b $$
The Code:
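Here is a minimal NumPy sketch of the IRLS iteration above (my own illustration, not the repository code). The `eps` floor on $ \left| {x}_{i} \right| $ is an assumed numerical guard, needed because the weight $ \left| {x}_{i} \right|^{p-2} $ blows up as $ {x}_{i} \to 0 $:

```python
import numpy as np

def solve_pnorm(A, b, lam, p, num_iters=500, eps=1e-8):
    """IRLS for 0.5 * ||A x - b||_2^2 + lam * ||x||_p  (1 < p <= 2)."""
    n = A.shape[1]
    # Ridge solution as a starting point (any nonzero x would do)
    x = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)
    for _ in range(num_iters):
        abs_x = np.maximum(np.abs(x), eps)  # floor to keep weights finite
        # X_ii = |x_i|^{p-2} * ||x||_p^{1-p} / p
        X_diag = abs_x ** (p - 2) * np.linalg.norm(x, p) ** (1 - p) / p
        x = np.linalg.solve(A.T @ A + lam * p * np.diag(X_diag), A.T @ b)
    return x
```

Each iteration is just a reweighted ridge solve, so the cost per step is one linear system; in practice a convergence check on $ \left\| {x}^{k+1} - {x}^{k} \right\| $ would replace the fixed iteration count.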
Solution Problem II
The function is given by:
$$ f \left( x \right) = \frac{1}{2} \left\| A x - b \right\|_{2}^{2} + \lambda \left\| x \right\|_{p}^{p} $$
The derivative is given by:
$$ \frac{d}{d x} f \left( x \right) = {A}^{T} \left( A x - b \right) + \lambda p X x $$
Where the matrix $ X $ is a diagonal matrix given by:
$$ {X}_{ii} = \left| {x}_{i} \right|^{p - 2} $$
The derivative vanishes at:
$$ x = \left( {A}^{T} A + \lambda p X \right)^{-1} {A}^{T} b $$
Since $ X $ depends on $ x $, the problem can be solved using Iteratively Reweighted Least Squares (IRLS):
$$ {x}^{k + 1} = \left( {A}^{T} A + \lambda p {X}^{k} \right)^{-1} {A}^{T} b $$
Where:
$$ {X}_{ii}^{k} = \left| {x}_{i}^{k} \right|^{p - 2} $$
The Code is given by:
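A minimal NumPy sketch of this second IRLS iteration (again my own illustration, with an assumed `eps` floor on the weights):

```python
import numpy as np

def solve_pnorm_p(A, b, lam, p, num_iters=500, eps=1e-8):
    """IRLS for 0.5 * ||A x - b||_2^2 + lam * ||x||_p^p  (1 <= p <= 2)."""
    n = A.shape[1]
    # Ridge solution as a starting point
    x = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)
    for _ in range(num_iters):
        abs_x = np.maximum(np.abs(x), eps)  # floor to keep weights finite
        X_diag = abs_x ** (p - 2)           # X_ii = |x_i|^{p-2}
        x = np.linalg.solve(A.T @ A + lam * p * np.diag(X_diag), A.T @ b)
    return x
```

As a sanity check, at $ p = 2 $ the weights are all $1$ and the iteration reproduces the closed-form ridge solution $ \left( {A}^{T} A + 2 \lambda I \right)^{-1} {A}^{T} b $ after a single step.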
The full code (including validation against CVX) is available at my StackExchange Mathematics Q2403596 GitHub Repository.