Efficiently compute rows of matrix which are not positive combinations of other rows

convex optimization · linear algebra · numerical linear algebra · numerical methods

Constrained convex optimization is a key component of many workflows/research projects in industry and academia. These problems take the form:

$$ \min_\theta f(\theta) $$
$$ \text{s.t. } M \theta \leq b $$

where $f$ is some convex function, $\theta \in \mathbb{R}^n$, $M$ is a $p \times n$ matrix, and $b \in \mathbb{R}^p$. Many generic solvers for convex optimization problems exist, but their theoretical and practical computational complexity scales with the number of constraints. Being able to reduce the number of constraints, that is, to reduce the number of rows of $M$ and entries of $b$, is therefore a central question.

The constraints defined by $M$ and $b$ can be reformulated into a single matrix $A$ which uniquely defines the constraints (we can leave the details of this reformulation aside).

The constraint matrix $A$ is called irreducible if no row of $A$ is a positive linear combination of the other rows. More formally, let $A_i$ denote the $i$th row of $A$ and let $A_{-i}$ denote the matrix obtained by removing the $i$th row of $A$; then $A_i$ is a positive linear combination of the other rows of $A$ if there is some $\nu \geq 0$ such that
$$ A_i^\intercal = A_{-i}^\intercal \nu.$$

Given a matrix $A$, I would like to construct a matrix $R$ which removes these reducible rows of $A$, and a matrix $M$ such that $$M R = A.$$
Essentially, $R$ is the irreducible version of the matrix constraint, and $M$ allows us to reconstruct $A$ from $R$.
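For a concrete example, take
$$ A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{bmatrix}. $$
The third row is the sum of the first two (a positive combination), so it is reducible, and
$$ R = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad M = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{bmatrix} $$
satisfy $M R = A$.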

There are reasonably fast algorithms for solving the nonnegative least squares (nnls) problem, i.e.,
$$\beta = \text{nnls}(y, M),$$
where $\beta$ is defined as
$$ \beta = \arg\min_{\beta \geq 0} \|y - M\beta\|_2^2. $$
I do not think that what follows is the best solution, but given an nnls routine, a simple algorithm for computing the irreducible set of rows might work like this:

import numpy as np
from scipy.optimize import nnls

reducible = []
for i in range(A.shape[0]):
    # fit row i as a nonnegative combination of the other rows
    A_minus_i = np.delete(A, i, axis=0)
    coefs, residual = nnls(A_minus_i.T, A[i])

    # if nonnegative least squares recovers A[i] exactly
    # (up to numerical tolerance), row i is reducible
    if residual < 1e-10:
        reducible.append(i)

R = np.delete(A, reducible, axis=0)  # keep the irreducible rows

We can then run nnls from $R$ to the reducible rows A[reducible] to compute the corresponding rows of $M$, so that $M R = A$.
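A minimal sketch of this reconstruction step, assuming scipy.optimize.nnls and the R computed above (the helper name reconstruction_matrix is mine):

import numpy as np
from scipy.optimize import nnls

def reconstruction_matrix(A, R):
    # one nnls fit per row of A: nonnegative coefficients
    # expressing A[i] in terms of the rows of R
    M = np.zeros((A.shape[0], R.shape[0]))
    for i in range(A.shape[0]):
        M[i], _ = nnls(R.T, A[i])
    return M  # M @ R reproduces A up to numerical error

Rows of $A$ that survive in $R$ fit exactly (a unit coefficient vector suffices), while reducible rows recover the nonnegative coefficients found earlier.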

Is there a faster way to do this? Are there properties of (ir)reducible matrices that could lead to a faster algorithm?

Best Answer

$\newcommand{\R}{\mathrm{I\!R}}\newcommand{\cone}{\mathrm{cone}}$I will give an answer regarding the columns of $A$ (I find it more convenient). If you are interested in the rows of $A$ you can apply the following methodology to $A^\intercal$.

Let $A\in\R^{m\times n}$ and $A = [a_1~a_2~\cdots~a_{n}]$. We define $A_{\setminus i}\in\R^{m\times(n-1)}$ to be $A$ without the $i$-th column.

Then $a_i$ is a conic (positive) combination of the columns of $A_{\setminus i}$ iff there is a vector $x\geq 0$ (the inequality is meant in the element-wise sense) such that

$$ A_{\setminus i}x = a_i. $$

Essentially, we are interested in solving the feasibility problem:

\begin{align} \text{Find } & x\in\R^{n-1} \text{ such that:} \\ &x \geq 0, \\ &A_{\setminus i}x = a_i. \end{align}
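For reference, this feasibility problem can also be posed as a linear program with a zero objective. Here is a minimal sketch using scipy.optimize.linprog rather than the SCS-based approach described below (the helper name is_conic_combination is mine):

import numpy as np
from scipy.optimize import linprog

def is_conic_combination(A_minus_i, a_i):
    # feasibility check: does some x >= 0 satisfy A_minus_i @ x = a_i?
    # the zero objective makes linprog a pure feasibility test
    n = A_minus_i.shape[1]
    res = linprog(c=np.zeros(n), A_eq=A_minus_i, b_eq=a_i,
                  bounds=[(0, None)] * n, method="highs")
    return res.status == 0  # 0 means an optimal (feasible) point was found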

Now for $I\subseteq \{1, \ldots, n\}$ let us define $A_{\setminus I}$ to be $A$ after we remove the columns $a_i$ for $i\in I$. For a set of indices $I$ and $j\in\{1, \ldots, n\}$, $j \notin I$, we define the feasibility problem (note that $a_j$ itself must also be excluded, otherwise the problem is trivially feasible with $x = e_j$):

\begin{align} \mathbb{P}_{I, j}: \text{Find } & x\in\R^{n-|I|-1} \text{ such that:} \\ &x \geq 0, \\ &A_{\setminus (I \cup \{j\})}x = a_j. \end{align}

We can then apply the following algorithm:

  • $I = \{\}$
  • for $j = 1,\ldots, n$
    • If $\mathbb{P}_{I,j}$ is feasible, $I \gets I \cup \{j\}$
  • Return $A_{\setminus I}$
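A sketch of this loop in Python, reusing the hypothetical is_conic_combination helper from above in place of a (Super)SCS call:

import numpy as np

def irreducible_columns(A):
    keep = list(range(A.shape[1]))           # complement of I
    for j in range(A.shape[1]):
        others = [k for k in keep if k != j]
        # P_{I,j}: is a_j a conic combination of the remaining columns?
        if others and is_conic_combination(A[:, others], A[:, j]):
            keep.remove(j)                   # I <- I ∪ {j}
    return A[:, keep]                        # A without the columns in I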

To solve the feasibility problem $\mathbb{P}_{I,j}$ we can use SCS or SuperSCS, both of which can determine whether $\mathbb{P}_{I, j}$ is feasible. Both SCS and SuperSCS need (internally) to solve a linear system whose left-hand side is the matrix $$ B_A = \begin{bmatrix} I & A & -b \\ -A^\intercal & I & 0 \\ -b^\intercal & 0 & I \end{bmatrix}. $$ In turn, as shown in [this paper, Section 4.1], this can be facilitated by computing the LDL factorisation of $$ M_A = \begin{bmatrix}I & -A^\intercal\\-A & I\end{bmatrix}. $$
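For illustration, forming $M_A$ and computing its LDL factorisation with SciPy might look like the following (this only sketches the factorisation itself, not how (Super)SCS consumes it internally):

import numpy as np
from scipy.linalg import ldl

def kkt_matrix(A):
    # M_A = [[I, -A^T], [-A, I]], where A is m-by-n
    m, n = A.shape
    return np.block([[np.eye(n), -A.T],
                     [-A, np.eye(m)]])

# lu @ d @ lu.T reconstructs M_A; lu[perm] is lower triangular
lu, d, perm = ldl(kkt_matrix(A))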

This factorisation can be computed once; every time we remove a column from $A$, we are essentially removing a row-column pair from $M_A$. To reduce the overall complexity of the above algorithm, we can update the corresponding LDL factorisation so we don't have to recompute it from scratch each time.
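To make the column/row correspondence concrete, here is a hypothetical sketch of the structural step only; a real implementation would update the LDL factors with a specialised up/downdate routine rather than rebuilding the matrix:

import numpy as np

def remove_column_pair(M_A, j):
    # removing column j of A deletes row j and column j of M_A
    # (j indexes the leading n-by-n identity block of M_A)
    idx = [k for k in range(M_A.shape[0]) if k != j]
    return M_A[np.ix_(idx, idx)]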

This leads to the following algorithm:

  • $I = \{\}$
  • Compute the LDL factorisation of $M_{A_{\setminus 1}}$
  • for $j = 1,\ldots, n$
    • Check whether $\mathbb{P}_{I,j}$ is feasible using (Super)SCS, passing the pre-computed LDL factorisation of $M_{A_{\setminus (I \cup \{j\})}}$ to (Super)SCS
    • If $\mathbb{P}_{I,j}$ is feasible, $I \gets I \cup \{j\}$
    • If $j < n$, update the factorisation to that of $M_{A_{\setminus (I \cup \{j+1\})}}$ for the next iteration
  • Return $A_{\setminus I}$

Note: The idea is that, since we need to solve a sequence of optimisation problems (well... feasibility problems), there may be quantities computed while solving each individual problem that can be reused to facilitate/accelerate the solution of the next one.
