# Binary matrix multiplication optimization problem

Tags: binary-programming, optimization

I am looking for pointers to, and names of, computational approaches for solving the following binary matrix optimization problem:

$$\text{minimize } \lVert\mathbf{X}\mathbf{Y} - \mathbf{T}\rVert_{1}$$
where $$\mathbf{X}$$ and $$\mathbf{Y}$$ are binary matrices and the matrix multiplication is over the integers (not mod-2).

• $$\mathbf{X}$$ is $$\{0,1\}^{H \times M}$$
• $$\mathbf{Y}$$ is $$\{0,1\}^{M \times W}$$
• $$\mathbf{T}$$ is $$[\![0,N]\!]^{H \times W}$$

The only known quantity is $$\mathbf{T}$$. The inner dimension $$M$$ of $$\mathbf{X}$$ and $$\mathbf{Y}$$ is not fixed and can "grow" to obtain a better solution. Eventually, there may be a desire to find the smallest $$M$$ that achieves a given value of the objective function.

One approach I have considered is that the dimension $$M$$ is set by the largest value in $$\mathbf{T}$$, with the corresponding elements of $$\mathbf{X}$$ and $$\mathbf{Y}$$ initialized to $$1$$, before an optimization algorithm begins. In this case, would I pass a vector input to an optimization algorithm (e.g. `scipy.optimize.minimize`) and reshape this $$1 \times (H \cdot M + M \cdot W)$$ vector into matrices when evaluating the objective function?
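As a minimal sketch of that pack-and-reshape idea (the dimensions, the seed, and the `objective` helper are illustrative assumptions, not part of the question):

```python
import numpy as np

H, M, W = 4, 3, 5  # small illustrative dimensions

def objective(v, T, H, M, W):
    """Unpack a flat 1 x (H*M + M*W) decision vector into X and Y,
    then evaluate the elementwise L1 error against T."""
    X = v[:H * M].reshape(H, M)
    Y = v[H * M:].reshape(M, W)
    return np.abs(X @ Y - T).sum()

# Pack a random binary guess into a single flat vector.
rng = np.random.default_rng(0)
X0 = rng.integers(0, 2, size=(H, M))
Y0 = rng.integers(0, 2, size=(M, W))
T = X0 @ Y0  # a target that is exactly representable, so the optimum is 0
v0 = np.concatenate([X0.ravel(), Y0.ravel()])
print(objective(v0, T, H, M, W))  # 0 for this constructed T
```

Note, however, that `scipy.optimize.minimize` works over continuous variables, so using this objective directly would require relaxing the binary constraint (e.g. box bounds plus rounding) or switching to a combinatorial/MILP solver.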

I have looked at several computational packages but am not sure of the best functions within them.

I have also considered solving for $$\textbf{X}$$ (or $$\textbf{Y}$$) over the positive reals (constrained by some maximum) and then converting $$\textbf{X}$$ to a binary matrix by growing its $$M$$ dimension based on a "resolution of the discretization". For example, if a matrix element were found to be 0.50 and the resolution were set to 0.1, the resulting binary $$\textbf{X}$$ matrix would contain a run of five 1s. To use this approach I would solve for $$\textbf{X}$$ after an initial random guess for $$\textbf{Y}$$, using the pseudoinverse from numpy:

```python
import numpy as np

Y = np.random.randint(2, size=(3, 4))  # random binary guess for Y
Y_pinv = np.linalg.pinv(Y)             # Moore-Penrose pseudoinverse
X = np.dot(T, Y_pinv)                  # least-squares solve for X
```
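A minimal sketch of that discretization step (the `discretize_row` helper and its rounding rule are my assumptions about the scheme described above, not code from the question):

```python
import numpy as np

def discretize_row(row, resolution):
    """Expand one real-valued row of X into a binary row: each entry v
    becomes round(v / resolution) consecutive 1s, growing the M dimension
    accordingly (illustrative helper)."""
    counts = np.rint(np.asarray(row) / resolution).astype(int)
    counts = np.maximum(counts, 0)  # clamp any negatives from the pinv solve
    pieces = [np.ones(c, dtype=int) for c in counts]
    return np.concatenate(pieces) if pieces else np.zeros(0, dtype=int)

print(discretize_row([0.50, 0.2], resolution=0.1))  # [1 1 1 1 1 1 1]
```

To keep the product $$\textbf{X}\textbf{Y}$$ consistent, the row of $$\textbf{Y}$$ corresponding to each expanded column of $$\textbf{X}$$ would have to be replicated the same number of times.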


Yet, since $$\textbf{Y}$$ is not optimized, the solution is far from optimal.

For fixed $$M$$, you want to minimize $$\sum_{i=1}^H \sum_{j=1}^W \left|\sum_{k=1}^M X_{i,k} Y_{k,j} - T_{i,j}\right|. \tag1$$

You can linearize the absolute value in $$(1)$$ by introducing nonnegative decision variables $$E^+_{i,j}$$ and $$E^-_{i,j}$$, yielding the problem of minimizing $$\sum_{i=1}^H \sum_{j=1}^W (E^+_{i,j} + E^-_{i,j}) \tag2$$ subject to $$\sum_{k=1}^M X_{i,k} Y_{k,j} - T_{i,j} = E^+_{i,j} - E^-_{i,j} \quad \text{for all } i \text{ and } j. \tag3$$

You can linearize the quadratic constraint $$(3)$$ by introducing binary (or just nonnegative) decision variables $$Z_{i,k,j}$$ to represent the products $$X_{i,k} Y_{k,j}$$ and imposing the linear constraints \begin{align} \sum_{k=1}^M Z_{i,k,j} - T_{i,j} &= E^+_{i,j} - E^-_{i,j} && \text{for all } i \text{ and } j \tag4 \\ Z_{i,k,j} &\le X_{i,k} &&\text{for all } i,k,j \tag5 \\ Z_{i,k,j} &\le Y_{k,j} &&\text{for all } i,k,j \tag6 \\ Z_{i,k,j} &\ge X_{i,k} + Y_{k,j} - 1 &&\text{for all } i,k,j \tag7 \end{align}

The linearized problem is to minimize $$(2)$$ subject to $$(4)$$ through $$(7)$$.
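Constraints $$(5)$$ through $$(7)$$ are the standard McCormick linearization of a product of binary variables; a quick brute-force check (a sketch, not part of the original answer) confirms that together with $$Z_{i,k,j} \in \{0,1\}$$ they force $$Z_{i,k,j} = X_{i,k} Y_{k,j}$$:

```python
from itertools import product

# For binary x and y, the constraints z <= x, z <= y, z >= x + y - 1
# leave exactly one feasible binary z, namely the product x*y.
for x, y in product((0, 1), repeat=2):
    feasible = [z for z in (0, 1) if z <= x and z <= y and z >= x + y - 1]
    assert feasible == [x * y]
print("Z is forced to equal X*Y in all four binary cases")
```

This is why declaring $$Z_{i,k,j}$$ merely nonnegative also suffices in the minimization: the constraints already pin each $$Z_{i,k,j}$$ to the binary product.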