Minimize the number of non-zero rows

integer programmingmatrix equationsmixed-integer programmingnonlinear optimizationoptimization

I am trying to formulate a shipping problem as a mathematical optimization one, and I'm having some trouble determining my optimization objective (cost function). Without getting too far into the details of the particular problem, the variable I am trying to optimize over is an $K\times N$ matrix X of non-negative integers. I have figured out some constraints on this matrix, namely $$ X^T \vec{1} = \vec{d} $$ and $$ X \preceq A $$ where $A$ is a non-negative $K\times N$ matrix of integers and $\vec{d}$ is a vector of length $N$ (of integers), and $\preceq$ indicates the usual component-wise matrix inequality. These are based on the particular problem at hand, and are nicely linear. This means I can plug these constraints into some numerical optimizer by simply "linearizing" the matrix $X$.

More complicatedly, as an objective I am trying to minimize a weighted count of non-zero rows of $X$. As such, I would like some sort of function rowNonEmpty which would take in a matrix row and return 0 if the row is all zeros and 1 otherwise. My optimization objective would then be something like
$$ \min_{X} \; \sum_{j=1}^K w_j \cdot \text{rowNonEmpty}\left(X_{j*}\right) $$ where $w_j$ is a weighting factor for row $j$ and $X_{j*}$ is the $j$th row of $X$.

Does anyone have some suggestions for how to properly frame this problem and the rowNonEmpty function? My first thought was that this is similar to some sort of weighted $L_1$ norm minimization, but not really because we are looking at non-zero rows of a matrix and not just vector entries. The first answer here suggests adding binary variables, but this is again for a more standard problem involving only a vector.

Best Answer

For each row $i$, introduce a binary variable $y_i$ to indicate whether the row is nonempty. The objective is to minimize $\sum_i w_i y_i$, and the constraints are: $$X_{i,j} \le A_{i,j} y_i$$ for each row $i$. This formulation enforces the logical implication $X_{i,j} > 0 \implies y_i = 1$, which is sufficient if $w_i \ge 0$.