[Math] How to convert a constrained optimization problem to an unconstrained one

lagrange-multiplier, optimization

Suppose I have an optimization problem

$$ \mathbf{w}^* = \arg\min_{\mathbf{w}} \sum_{i=1}^{N} (y_i - \mathbf{w}\cdot\mathbf{x}_i)^2 $$
$$\text{subject to}$$
$$\sum_{j=1}^m w_j^2 \leq \tau$$

I want to show this can be written as

$$ \mathbf{w}^* = \arg\min_{\mathbf{w}} \sum_{i=1}^{N} (y_i - \mathbf{w}\cdot\mathbf{x}_i)^2 + \lambda \sum_{j=1}^m w_j^2$$

What is the basic intuition behind this?

I have done basic Lagrangian multiplier problems like

$$\max_{\mathbf{x}} U(\mathbf{x})$$
$$\text{subject to}$$
$$\mathbf{x}\cdot \mathbf{w} = \tau$$

In those cases, I have written a Lagrangian like

$$L(\mathbf{x},\lambda) = U(\mathbf{x}) + \lambda (\tau - \mathbf{x}\cdot \mathbf{w})$$

But there are two issues:

  1. A Lagrangian isn't itself something I minimize, correct? I am trying to rewrite my constrained problem as an unconstrained problem.
  2. The constraint is an inequality, not an equality. I assume the Kuhn-Tucker conditions are relevant here, but I am not sure how.
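For reference, the general Karush-Kuhn-Tucker (KKT) conditions for a problem of the form $\min_{\mathbf{w}} f(\mathbf{w})$ subject to $g(\mathbf{w}) \leq 0$ are

$$ \nabla f(\mathbf{w}^*) + \lambda^* \nabla g(\mathbf{w}^*) = 0, \qquad g(\mathbf{w}^*) \leq 0, \qquad \lambda^* \geq 0, \qquad \lambda^* \, g(\mathbf{w}^*) = 0 $$

Here $g(\mathbf{w}) = \sum_{j=1}^m w_j^2 - \tau$, and the last condition (complementary slackness) says the multiplier vanishes whenever the constraint is inactive.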

Best Answer

Introducing a slack variable $\epsilon$, you can write the constrained problem as the unconstrained minimization of the Lagrangian

$$ L(\mathbf{w},\lambda,\epsilon) = \sum_{i=1}^{N} (y_i - \mathbf{w}\cdot\mathbf{x}_i)^2 + \lambda \left(\sum_{j=1}^m w_j^2-\tau + \epsilon^2\right) $$
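Stationarity with respect to the slack variable already encodes complementary slackness:

$$ \frac{\partial L}{\partial \epsilon} = 2\lambda\epsilon = 0 \quad\Longrightarrow\quad \lambda = 0 \ \text{ or } \ \epsilon = 0 $$

If $\epsilon = 0$, the constraint is active ($\sum_{j=1}^m w_j^2 = \tau$) and $\lambda$ may be positive; if $\lambda = 0$, the penalty term drops out and the constraint holds with slack.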

NOTES

1) Setting the gradient of the Lagrangian to zero gives its stationary points, which must then be checked as candidate optima (constraint qualification).

2) Analyzing the optimal values $\epsilon^*, \lambda^*$, we can recover the Kuhn-Tucker conditions.
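To see the constrained/penalized correspondence concretely, here is a minimal numerical sketch (with made-up synthetic data) using the closed form of the penalized minimizer: as $\lambda$ grows, the squared norm of the solution shrinks, so any active constraint level $\tau$ along that path is attained by some $\lambda \geq 0$.

```python
import numpy as np

# Assumed synthetic data (illustration only): N samples, m features.
rng = np.random.default_rng(0)
N, m = 50, 5
X = rng.normal(size=(N, m))
y = X @ rng.normal(size=m) + 0.1 * rng.normal(size=N)

def penalized_minimizer(lam):
    """Minimizer of sum_i (y_i - w.x_i)^2 + lam * ||w||^2,
    via the closed form w = (X^T X + lam I)^{-1} X^T y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ y)

# ||w(lam)||^2 decreases as lam increases, tracing out the
# constraint levels tau that each multiplier value enforces.
norms = [float(w @ w) for w in map(penalized_minimizer, (0.0, 1.0, 10.0, 100.0))]
print(norms)
```

The monotone decrease of `norms` is the intuition behind the rewrite: tightening $\tau$ in the constrained problem corresponds to raising $\lambda$ in the penalized one.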
