Norm constrained least square minimization with an additional single linear equality constraint: A quadratically constrained quadratic program (QCQP)

constraintsconvex optimizationleast squaresnonlinear optimizationqcqp

Given is the following QCQP problem:
\begin{align}
&&\min_{\mathbf{x} \in \mathbb{C}^n} &\|A \mathbf{x} – \mathbf{b}\|^2,\tag{1}\label{1}\\
\text{ subject to:}&&&\\
&&\|\mathbf{x}\|^2 &\leq \alpha^2,\tag{C1}\label{C1}\\
&&\mathbf{c}^{\dagger} \mathbf{x} &= \gamma,\tag{C2}\label{C2}
\end{align}

with:

  • $A \in \mathbb{C}^{m\times n}$, with $m, n \in \mathbb{N}: m \geq n$: $\ A^{\dagger} A$ positive definite,
  • $\mathbf{b} \in \mathbb{C}^m,\mathbf{c} \in \mathbb{C}^n: 0< \|\mathbf{c}\| < \infty$,
  • $\gamma \in \mathbb{C}: |\gamma| < \infty,\, \alpha \in \mathbb{R}: |\gamma|^2/\|\mathbf{c}\|^2 \leq \alpha^2 < \infty$,
  • $\|.\|$$2$-norm,
  • ${}^\dagger$ – Hermitian adjoint, the combined operation of complex conjugation and transposition.

Obviously, there exists no solution for $\alpha^2 < |\gamma|^2/\|\mathbf{c}\|^2$.
For the solution of the only norm constraint problem, i.e. without the constraint \eqref{C2}, see here and here.

With the definitions $f(\mathbf{x}):=\|A \mathbf{x} – \mathbf{b}\|^2$, $h(\mathbf{x}):=\mathbf{c}^{\dagger}\mathbf{x} – \gamma$ and $g(\mathbf{x}):=\|\mathbf{x}\|^2 – \alpha^2$, we can write the problem in the standard form:
\begin{align}
&&\min_{\mathbf{x} \in \mathbb{C}^n} &f(\mathbf{x}),\tag{1}\label{1a}\\
\text{ subject to:}&&&\\
&&g(\mathbf{x}) &\leq 0,\tag{C1}\label{C2a}\\
&&h(\mathbf{x})& = 0.\tag{C2}\label{C1a}
\end{align}

The Lagrangian is given by
$$L(\mathbf{x},\mu, \lambda) = f(\mathbf{x}) + \mu g(\mathbf{x}) + \lambda h(\mathbf{x}),\tag{2}\label{2}$$
with $(\mu, \lambda)$ KKT-multipliers.
In the following we abbreviate $\mathbf{y}:=A^{\dagger}\mathbf{b}$ and denote with $I$ the $n\times n $ identity and with $\mathbf{0}$ a length $n$ column vector with all zero entries.
Solving the linear matrix equation
$$
\begin{pmatrix}
A^{\dagger}A + \mu I & \mathbf{c}\\
\mathbf{c}^{\dagger} & 0
\end{pmatrix}
\begin{pmatrix}
\mathbf{x}(\mu)\\
\lambda(\mu)
\end{pmatrix}
=
\begin{pmatrix}
\mathbf{y}\\
\gamma
\end{pmatrix},\tag{3}\label{3}
$$

we obtain the $\mu$-parametric solution
$$
\mathbf{x}(\mu) =
\begin{pmatrix}
I &\mathbf{0}
\end{pmatrix}
\begin{pmatrix}
A^{\dagger}A + \mu I & \mathbf{c}\\
\mathbf{c}^{\dagger} & 0
\end{pmatrix}^{-1}
\begin{pmatrix}
\mathbf{y}\\
\gamma
\end{pmatrix}.\tag{4}\label{4}
$$

Using the formula for block matrix inversion,
$$
\begin{pmatrix}
G & \mathbf{v}\\
\mathbf{v}^{\dagger} &0
\end{pmatrix}^{-1}
=
\begin{pmatrix}
G^{-1} – \frac{G^{-1}\mathbf{v}\mathbf{v}^{\dagger}G^{-1}}{\mathbf{v}^{\dagger}G^{-1}\mathbf{v}} & \frac{G^{-1}\mathbf{v}}{\mathbf{v}^{\dagger}G^{-1}\mathbf{v}}\\
\frac{\mathbf{v}^{\dagger} G^{-1}}{\mathbf{v}^{\dagger}G^{-1}\mathbf{v}} & -\frac{1}{\mathbf{v}^{\dagger}G^{-1}\mathbf{v}}
\end{pmatrix},\tag{5}\label{5}
$$

an explicit expression for the $\mu$-parametric solution can be written as
$$
\mathbf{x}(\mu) = (A^{\dagger}A + \mu I)^{-1}\left(\mathbf{y} – w(\mu)\mathbf{c}\right),\tag{6}\label{6}
$$

where
$$
w(\mu):=\frac{\mathbf{c}^{\dagger} (A^{\dagger}A + \mu I)^{-1} \mathbf{y} – \gamma}{\mathbf{c}^{\dagger} (A^{\dagger}A + \mu I)^{-1}\mathbf{c}}.\tag{7}\label{7}
$$

The solution of \eqref{6} fulfills \eqref{C2}. The remaining task is to find $\mu^*$:
$$
f(\mathbf{x}(\mu^*)) = \min_{\mu:\, g(\mathbf{x}(\mu)) \leq 0} f(\mathbf{x}(\mu)).\tag{8}\label{8}
$$

Questions

(Q1) Is $\|\mathbf{x}(\mu)\|^2$ monotonically decreasing with increasing $|\mu|$?

(Q2) Is $\mu^* \geq 0$?

(Q3) Is $\mu^* = \inf\, \{\mu \geq 0 \mid g(\mathbf{x}(\mu)) \leq 0\}$?

Best Answer

Answers

(Q1)

$\|\mathbf{x}(\mu)\|^2$ is monotonically decreasing without jump discontinuities for $\mu \geq 0$ with $\lim_{\mu \to \infty}\|\mathbf{x}(\mu)\|^2 = |\gamma|^2/\|\mathbf{c}\|^2$. If for a fixed $\mu: 0 \leq \mu < \infty$ the solution is $\mathbf{x}^* = \gamma \mathbf{c} /\|\mathbf{c}\|^2$, then this is the solution for all non-negative values of $\mu$. For $\mu <0$ monotonicity holds not on the whole domain of negative real numbers and jump discontinuities can appear.

(Q2)

Yes, see below.

(Q3)

Yes and the global optimal solution is given by $\mathbf{x}^* = \mathbf{x}(\mu^*)$, with $\mathbf{x}(\mu)$ from (6). Although this solution also holds for the case $\alpha^2 = |\gamma|^2/\|\mathbf{c}\|^2$, where the only allowed solution is the one for $\mu \to \infty$, in practice no limit must be taken due to the fact that the solution is already determined to be $\mathbf{x}^* = \gamma \mathbf{c} /\|\mathbf{c}\|^2$.

Strategy for numerical solution

For $\alpha^2 = |\gamma|^2/\|\mathbf{c}\|^2$, no calculation is necessary since the only allowed solution is $\mathbf{x}^* = \gamma \mathbf{c} /\|\mathbf{c}\|^2$.
For $\alpha^2 > |\gamma|^2/\|\mathbf{c}\|^2$, the numerical procedure to calculate the optimal solution simplifies with the help of the answers of (Q1) and (Q2) to calculate $\mathbf{x}(\mu)$ according to (6), starting with $\mu=0$ and increase $\mu$ until (C1) holds.
No solution exists obviously for $\alpha^2 < |\gamma|^2/\|\mathbf{c}\|^2$.

Proofs/Reasons

(Q1)

We first investigate the asymptotic solution for large $\mu$. As a first step we divide the first row of (3) by $\mu$: $$ \left(A^{\dagger} A/\mu + I\right) \mathbf{x}(\mu) + \mathbf{c} \lambda(\mu)/\mu = \mathbf{y}/\mu.\tag{A1}\label{A1} $$ Multiplying \eqref{A1} from the left with $\mathbf{c}^{\dagger}$ and using both, the constraint (C2) and the fact that we look for solutions with bounded norm only, we obtain the asymptotic expression $$ \lim_{\mu \to \infty}\lambda(\mu)/\mu = -\gamma/\|\mathbf{c}\|^2\tag{A2}.\label{A2} $$ With \eqref{A2} in the asymptotic limit of \eqref{A1}, we find $$ \lim_{\mu \to \infty} \mathbf{x}(\mu) = \gamma \mathbf{c} /\|\mathbf{c}\|^2=:\mathbf{x}_{\infty},\tag{A3}\label{A3} $$ and thus $$ \lim_{\mu \to \infty} \|\mathbf{x}(\mu)\|^2 = |\gamma|^2/\|\mathbf{c}\|^2.\tag{A4}\label{A4} $$ To investigate the monotonicity of $\|\mathbf{x}(\mu)\|^2$ we start with differentiating (3) w.r.t. $\mu$: The first row yields $$ \mathbf{x}(\mu) + (A^{\dagger}A + \mu I) \mathbf{x}'(\mu) + \mathbf{c}\lambda'(\mu) = 0,\tag{A5}\label{A5} $$ and the second row the trivial condition $\mathbf{c}^{\dagger} \mathbf{x}'(\mu) = 0$. We use for the function derivative w.r.t. $\mu$ the abbreviation $u'(\mu) \equiv \rm{d}\, u(\mu)/\rm{d}\mu$. Multiplying \eqref{A5} with $\mathbf{c}^{\dagger}$ from left we obtain $$ \gamma + \mathbf{c}^{\dagger} A^{\dagger}A\mathbf{x}'(\mu) + \|\mathbf{c}\|^2\lambda'(\mu) = 0,\tag{A6}\label{A6} $$ which yields $$ \lambda'(\mu) = -\dfrac{ \gamma + \mathbf{c}^{\dagger}A^{\dagger}A \mathbf{x}'(\mu)}{\|\mathbf{c}\|^2}.\tag{A7}\label{A7} $$ Using \eqref{A7} in \eqref{A5} we get a conditional equation for $\mathbf{x}'(\mu)$: $$ \left(\left(I - \dfrac{\mathbf{c}\mathbf{c}^{\dagger}}{\|\mathbf{c}\|^2}\right)A^{\dagger}A + \mu I\right) \mathbf{x}'(\mu) = \frac{\gamma \mathbf{c}}{\|\mathbf{c}\|^2} - \mathbf{x}(\mu).\tag{A8}\label{A8} $$ On the left hand side the orthogonal projector $P:=I - \mathbf{c}\mathbf{c}^{\dagger}/\|\mathbf{c}\|^2$ appears that maps elements of $\mathbb{C}^n$ to $\mathcal{U} = \{\mathbf{x} \in \mathbb{C}^n \mid \mathbf{c}^{\dagger}\mathbf{x} = 0\}\subset \mathbb{C}^n$, the subspace orthogonal to the one-dimensional subspace along $\mathbf{c}$. Multiplying \eqref{A8} with $P$ from left we get: $$ \left(PA^{\dagger}A + \mu P \right) \mathbf{x}'(\mu) = -P\mathbf{x}(\mu).\tag{A9}\label{A9} $$ Since $\mathbf{c}^{\dagger} \mathbf{x}'(\mu) = 0$, which is equivalent to $P\mathbf{x}'(\mu) = \mathbf{x}'(\mu)$, we can further write $$ \left(PA^{\dagger}AP + \mu P \right) \mathbf{x}'(\mu) = -P\mathbf{x}(\mu),\tag{A10}\label{A10} $$ which is a linear equation on $\mathcal{U}$ only that completely determines $\mathbf{x}'(\mu)$. We expand \eqref{A10} in an orthonormal basis of $\mathcal{U}$ and use the following notation: $\mathbf{x}_{\mathcal{U}}(\mu)$, $\mathbf{x}'_{\mathcal{U}}(\mu)$ for the representation of $P\mathbf{x}(\mu)$, $\mathbf{x}'(\mu)$ in $\mathcal{U}$, $I_{\mathcal{U}}$ for the identity in $\mathcal{U}$ and $S_{\mathcal{U}}$ for the representation of $PA^{\dagger}AP$ in $\mathcal{U}$. We can write $$ \mathbf{x}'_{\mathcal{U}} = -\left(S_{\mathcal{U}} + \mu I_{\mathcal{U}}\right)^{-1}\mathbf{x}_{\mathcal{U}}(\mu),\tag{A11}\label{A11} $$ and it follows \begin{align} \frac{\rm{d}}{\rm{d}\mu}\|\mathbf{x}(\mu)\|^2 & = \mathbf{x}'(\mu)^{\dagger}\mathbf{x}(\mu) + \mathbf{x}(\mu)^{\dagger}\mathbf{x}'(\mu)\tag{A12}\label{A12}\\ &=\mathbf{x}'(\mu)^{\dagger}P P \mathbf{x}(\mu) + \mathbf{x}(\mu)^{\dagger}PP\mathbf{x}'(\mu)\\ &=\mathbf{x}_{\mathcal{U}}'(\mu)^{\dagger}\mathbf{x}_{\mathcal{U}}(\mu) + \mathbf{x}_{\mathcal{U}}(\mu)^{\dagger}\mathbf{x}_{\mathcal{U}}'(\mu)\\ &=-2\mathbf{x}_{\mathcal{U}}(\mu)^{\dagger}\left(S_{\mathcal{U}} + \mu I_{\mathcal{U}}\right)^{-1}\mathbf{x}_{\mathcal{U}}(\mu).\tag{A13}\label{A13} \end{align} It can be readily seen from its definition that $S_{\mathcal{U}}$ is positive definite, hence $\left(S_{\mathcal{U}} + \mu I_{\mathcal{U}}\right)^{-1}$ is positive definite for $\mu \geq 0$. Since $A^{\dagger}A$ is also positive definite we obtain from (6) together with (7) that $\mathbf{x}(\mu)$ is not diverging for $\mu\geq 0$. Therefore, from \eqref{A13} we obtain $\frac{\rm{d}}{\rm{d}\mu}\|\mathbf{x}(\mu)\|^2 \leq 0$ for $\mu \geq 0$, that is $\|\mathbf{x}(\mu)\|^2$ is monotonically decreasing without jump discontinuities for $\mu \geq 0$. However, the monotonicity is not strict because $\rm{d}\|\mathbf{x}(\mu)\|^2/\rm{d}\mu$ vanishes when $\mathbf{x}_{\mathcal{U}}(\mu) = 0$. The only solution compatible with both constraints (C1) and (C2) that yields $\mathbf{x}_{\mathcal{U}}(\mu) = 0$ is $\mathbf{x}(\mu) = \gamma\mathbf{c}/\|\mathbf{c}\|^2 = \mathbf{x}_{\infty}$. Since this is also the asymptotic solution from \eqref{A3}, we have $\lim_{\mu \to \infty} \rm{d}\|\mathbf{x}(\mu)\|^2/\rm{d}\mu = 0$. Moreover, if (6) yields $\mathbf{x}_{\infty}$ for $\mu < \infty$, it must be the solution for all $\mu$.

For $\mu < 0$, $\left(S_{\mathcal{U}} + \mu I_{\mathcal{U}}\right)^{-1}$ is not guaranteed to be positive definite and hence monotonicity is not given on the whole domain of negative real numbers. Moreover, jump discontinuities can appear where $\mu$ matches eigenvalues of $A^{\dagger}A$ or $S_{\mathcal{U}}$.

(Q2)

We have to distinguish two cases:

  • $\alpha^2 > |\gamma|^2/\|\mathbf{c}\|^2$:
    With the monotonicity of $\|\mathbf{x}(\mu)\|^2$ for $\mu \geq 0$ and \eqref{A4}, we find a $\mu^\star: g(\mathbf{x}(\mu^\star)) < 0$. Therefore, Slater's condition holds which guarantees strong duality of the convex optimization problem and thus the existence of a KKT-point $(\mathbf{x}^*,\mu^*, \lambda^*)$, where $\mathbf{x}^*$ is a local optimum for the optimization problem and $(\mu^*,\lambda^*)$ for the corresponding dual problem. Due to the convexity and strong duality of the problem the following KKT-conditions are sufficient conditions for the global optimum: \begin{align} g(\mathbf{x}^*) &\leq 0,\tag{A14}\label{A14}\\ h(\mathbf{x}^*) &= 0,\tag{A15}\label{A15}\\ \mu^* &\geq 0,\tag{A16}\label{A16}\\ \mu^*g(\mathbf{x}^*) &= 0.\tag{A17}\label{A17} \end{align} \eqref{A16} answers the question for this case.

  • $\alpha^2 = |\gamma|^2/\|\mathbf{c}\|^2$:
    Slater's condition does not hold, however, the only allowed solution is $\mathbf{x} = \mathbf{x}_{\infty}$, the asymptotic solution. Therefore, we have $\mu^* \to \infty$.

For both cases we get $\mu^* \geq 0$.

(Q3)

For $\mathbf{x}(\mu)$ from (6), \eqref{A15} holds. The remaining task is to find the optimal $\mu^* \geq 0$ such that the remaining KKT-conditions hold. To satisfy \eqref{A17}, we require $\mu^* = 0$ if $g(\mathbf{x}(\mu^*=0)) < 0$ and $\mu^* >0$ otherwise. For the latter case we require $\|\mathbf{x}(\mu^*)\|^2 = \alpha^2$ such that (A14) holds. Due to the fact that $\|\mathbf{x}(\mu)\|^2$ is monotonically decreasing for $\mu \geq 0$, we find $\mu^* = \inf\, \{\mu \geq 0 \mid g(\mathbf{x}(\mu)) \leq 0\}$ and the global optimal solution is given by $\mathbf{x}^* = \mathbf{x}(\mu^*)$.