[Math] Minimizing $L_1$ Regularization

Tags: derivatives, machine learning, matrices, vector analysis

I am given a high-dimensional input $x \in \mathbb{R}^m$ where $m$ is large. Linear regression can be applied, but in general many of these dimensions are expected to be irrelevant.

I need to find a method that models the function $y = f(x)$ and at the same time uncovers which dimensions contribute to the output. The given hint is to apply $L_1$-norm (lasso) regularization.

$$L^{\text{lasso}}(\beta) = \sum_{i=1}^n \left(y_i - \phi(x_i)^T \beta\right)^2 + \lambda \sum_{j=1}^k |\beta_j|$$
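For concreteness, here is a minimal NumPy sketch of this objective. The names `Phi`, `lam`, and the $n \times k$ layout of the feature matrix are my assumptions, not given in the question:

```python
import numpy as np

def lasso_loss(beta, Phi, y, lam):
    """Lasso objective: squared error plus lam * ||beta||_1.

    Phi : (n, k) matrix whose i-th row is phi(x_i)^T  (assumed layout)
    y   : (n,)  vector of targets
    """
    residuals = y - Phi @ beta              # y_i - phi(x_i)^T beta
    return residuals @ residuals + lam * np.abs(beta).sum()
```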

Minimizing $L^{\text{lasso}}$ is hard in general, so I should apply gradient descent. My approach so far is the following:

In order to minimize the term, I chose to compute the gradient and set it to $0$, i.e.

$$\frac{\partial}{\partial \beta} L^{\text{lasso}}(\beta) = -2 \sum_{i=1}^n \phi(x_i)(y_i - \beta)$$

Since this cannot be solved in closed form, I want to apply gradient descent here. Gradient descent takes a function $f(x)$ as input, and the $L(\beta)$ function will be put in its place.

My problems are:

  • is the gradient I computed correct? I left out the regularization term at the end, which I guess is a problem, but I also have trouble computing the gradient for that term

    In addition, I may replace $|\beta_j|$ by the function $l(x) = |x| - \varepsilon/2$ if $|x| \geq \varepsilon$, else $l(x) = x^2/(2\varepsilon)$

  • one step of gradient descent is $\beta \leftarrow \beta - \alpha \cdot \frac{g}{|g|}$, where $g = \left(\frac{\partial}{\partial \beta} f(x)\right)^T$

    I don't understand this step: when I transpose the gradient I have computed, the update cannot be evaluated due to a dimension mismatch (see the sketch after this list)
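To make both bullets concrete, here is a hedged NumPy sketch of the smoothed absolute value from the first point and the normalized update from the second; the names are illustrative, with `Phi` again assumed to be the $n \times k$ matrix whose rows are $\phi(x_i)^T$. Note that the gradient is simply a length-$k$ vector, so if $\beta$ and $g$ are both treated as such, no dimension mismatch arises:

```python
import numpy as np

def smooth_abs_grad(beta, eps):
    """Derivative of the surrogate l from the first bullet:
    l(x) = x**2 / (2*eps)  if |x| <  eps  ->  l'(x) = x / eps
    l(x) = |x| - eps / 2   if |x| >= eps  ->  l'(x) = sign(x)
    """
    return np.where(np.abs(beta) < eps, beta / eps, np.sign(beta))

def gradient_step(beta, Phi, y, lam, alpha, eps):
    """One normalized step from the second bullet: beta <- beta - alpha * g / |g|."""
    g = -2 * Phi.T @ (y - Phi @ beta) + lam * smooth_abs_grad(beta, eps)
    return beta - alpha * g / np.linalg.norm(g)
```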

Best Answer

The subgradient should be like this:

$$\frac{\partial}{\partial \beta} L^{\text{lasso}}(\beta) = -2 \sum_{i=1}^n \phi(x_i)\left(y_i - \phi(x_i)^T \beta\right) + \lambda \operatorname{sign}(\beta)$$
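As a hedged sketch of how this subgradient could be used, a plain subgradient-descent loop might look as follows (NumPy; `Phi`, `alpha`, and `iters` are illustrative names, and `np.sign(0) = 0` is a valid subgradient of $|\cdot|$ at $0$):

```python
import numpy as np

def lasso_subgradient_descent(Phi, y, lam, alpha=1e-3, iters=1000):
    """Minimize the lasso objective by subgradient descent, using
    g = -2 * Phi^T (y - Phi @ beta) + lam * sign(beta)."""
    beta = np.zeros(Phi.shape[1])
    for _ in range(iters):
        g = -2 * Phi.T @ (y - Phi @ beta) + lam * np.sign(beta)
        beta -= alpha * g
    return beta
```

With a diminishing step size (e.g. $\alpha_t \propto 1/\sqrt{t}$) the iterates approach a minimizer; a fixed $\alpha$ only reaches a neighborhood of it.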
