[Math] Linear objective function with quadratic constraints

convex optimizationoptimizationqclpregressionstatistics

The context is ordinary multivariate regression with $k$ (>1) regressors, i.e. $Y = X\beta + \epsilon$, where
$Y \in \mathbb{R}^{n \times 1}$ vector of predicted variable,
$X \in \mathbb{R}^{n \times (k+1)}$ matrix of regressor variables(including ones in the first column)
$\beta \in \mathbb{R}^{(k+1) \times 1}$ vector of coefficients, including intercept.

Say, I have already estimated $\beta$ as $\hat{\beta} = (X'X)^{-1} X'Y.$

I have to solve the following program:

minimize $f(B) = L\beta$ ( $L$ is a fixed $1\times (k+1)-$vector )
such that:
$[(\beta-\hat{\beta})' \cdot X'X \cdot (\beta-\hat{\beta})] / [(Y – X\hat{\beta})' (Y – X\hat{\beta}) ]$ is less than a given value $c$.

Note that this is a linear optimization program with respect to $\beta$ with quadratic constraints.

I don't understand how we can solve this optimization – I was going through some online resources, each of which involve manually computing gradients of the objective as well as constraint functions – which I want to avoid (at least manually doing this).

Can you please help solve this optimization problem in R ? The inputs would be:
$X$, $Y$,
$\hat{\beta}$,
$L$ and
$c$

Please let me know if any further information is required – the set-up is pretty general.

Best Answer

We have the quadratically constrained linear program (QCLP) in $\beta \in \mathbb R^p$

$$\begin{array}{ll} \text{minimize} & \langle \mathrm w, \beta \rangle\\ \text{subject to} & \| \mathrm X (\beta - \hat{\beta}) \|_2^2 \leq \gamma^2 \| \mathrm y - \mathrm X \hat{\beta} \|_2^2\end{array}$$

where $\gamma > 0$. Let

$$r := \gamma \| \mathrm y - \mathrm X \hat{\beta} \|_2$$

and

$$\mathrm z := \mathrm X (\beta - \hat{\beta})$$

If $\mathrm X$ has full column rank, then

$$\beta = \hat{\beta} + (\mathrm X^{\top} \mathrm X)^{-1} \mathrm X^{\top} \mathrm z$$

Hence, we have the following QCLP in $\mathrm z \in \mathbb R^n$

$$\begin{array}{ll} \text{minimize} & \langle \mathrm w, (\mathrm X^{\top} \mathrm X)^{-1} \mathrm X^{\top} \mathrm z \rangle\\ \text{subject to} & \| \mathrm z \|_2^2 \leq r^2\end{array}$$

which can be rewritten as

$$\begin{array}{ll} \text{minimize} & \langle \tilde{\mathrm w}, \mathrm z \rangle\\ \text{subject to} & \| \mathrm z \|_2^2 \leq r^2\end{array}$$

where $\tilde{\mathrm w} := \mathrm X (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w$. Since the objective function is linear and the feasible region is convex, the minimum should be attained on the boundary of the Euclidean ball of radius $r$ centered at the origin. Hence, let us solve the following equality-constrained (non-convex) QCLP instead

$$\begin{array}{ll} \text{minimize} & \langle \tilde{\mathrm w}, \mathrm z \rangle\\ \text{subject to} & \| \mathrm z \|_2^2 = r^2\end{array}$$

We define the Lagrangian

$$\mathcal L (\mathrm z, \lambda) := \langle \tilde{\mathrm w}, \mathrm z \rangle - \frac{\lambda}{2} (\mathrm z^{\top} \mathrm z - r^2)$$

which produces the minimizer

$$\mathrm z_{\min} := - r \, \frac{\tilde{\mathrm w} \,\,}{\| \tilde{\mathrm w} \|_2} = - r \, \frac{\mathrm X (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w \,\,}{\| \mathrm X (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w \|_2}$$

Thus, the minimizer is

$$\boxed{\beta_{\min} := \hat{\beta} + (\mathrm X^{\top} \mathrm X)^{-1} \mathrm X^{\top} \mathrm z_{\min} = \hat{\beta} - r \, \frac{(\mathrm X^{\top} \mathrm X)^{-1} \mathrm w \,\,}{\| \mathrm X (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w \|_2}}$$

and the minimum is

$$\begin{array}{rl} \langle \mathrm w, \beta_{\min} \rangle &= \langle \mathrm w, \hat{\beta} \rangle - r \, \dfrac{\mathrm w^{\top} (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w \,\,}{\| \mathrm X (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w \|_2}\\\\ &= \langle \mathrm w, \hat{\beta} \rangle - r \, \dfrac{\mathrm w^{\top} (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w}{\sqrt{\mathrm w^{\top} (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w}}\\\\ &= \langle \mathrm w, \hat{\beta} \rangle - r \sqrt{\mathrm w^{\top} (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w}\end{array}$$

Related Question