Find pairs of linearly associated linear combinations

convex optimizationlinear programmingmixed-integer programmingoptimizationquadratic programming

I am trying to figure out if the following problem can be formulated using linear or quadratic programming:
consider a set of $n$ normalized vectors $V = \{V_1,..,V_n\}$ with $V_i \in \mathbb{R}^d$. The goal is to find (sparse) pair(s) of linear combinations with $a_1V_i + .. + a_lV_l$ and $b_1V_j + .. + b_mV_m$ such that

  1. the linear combinations in each pair are linearly associated/correlated.
  2. the vectors in the first linear combination do not occur in the second linear combination (what I tried to indicate by using $i, l$ and $j, m$ when numbering the vectors in the aforementioned linear combinations. Sorry if this is weird use of notation.)
  3. $l + m < n$ (or $l < n$ and $m < n$), i.e. the linear combinations are sparse/not all vectors in $V$ participate

The values
$a_1,..,a_l$ and $b_1,..,b_m$ are coefficients/weights with $a_i, b_i \in [-1,1]$.

My attempt at modeling this problem is to formulate it as an optimization problem and set $a_1V_1 + .. + a_nV_n + r_1 = b_1V_1 + .. + b_nV_n + r_2$, which can be encoded as $d$ linear equality constraints (one for each dimension). Here, $r_1, r_2 \in \mathbb{R}^d$ are vectors that serve as slack variables (i.e. error/residual terms for when the values in the linear combinations of the vectors do not match exactly).

The objective then becomes the minimization of $\sum_{i=1}^d (r_1 + r_2)$.
Since the objective and constraints are linear in this case, I can formulate this as a linear programming problem. For obtaining sparse combinations, I can add a regularization term. However, modeling the fact that one vector cannot occur in both linear combinations requires me to introduce a binary variable that sets either $a_i = 0$ or $b_i = 0$.
This turns the problem into a mixed-integer LP which works but makes it hard to scale to larger $n$.

I wonder if there's a way to reformulate it as an LP or QP? The problem seems to me close to regression (least absolute deviation for LP formulation, least squares for QP formulation) except that there is no obvious starting $y$.

Best Answer

You cannot model this as an LP or QP with the mutual exclusion provision (item 2) as a hard constraint. Suppose, for instance, that one feasible solution has only a_1 and b_2 positive, and another has only a_2 and b_1 positive. Since the feasible region of an LP or QP is convex, the midpoint of those solutions would also be feasible, and it would have a_1, a_2, b_1 and b_2 all positive.

That leaves the question of whether you could add a term to the objective function to discourage violations of the mutual exclusion rule. I don't see any way to do that, but I don't have a ready proof that it is impossible.

Related Question