Nearest Kronecker product to sum of Kronecker products

Tags: linear algebra, matrices, numerical linear algebra

I am interested in efficiently finding the closest Kronecker decomposition to the sum of $k$ Kronecker products:
$$\min_{A,B} \| A \otimes B - \sum_{i=1}^k A_i \otimes B_i \|_F$$
where $A, A_i$ are $p \times p$, $B, B_i$ are $q \times q$, $k < 10$, $p \approx 10$, and $q \approx 50$. Furthermore, the $A_i, B_i$ are all PSD.
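For concreteness, here is the objective written out naively in NumPy (the function and variable names are mine); it forms the full $pq \times pq$ sum, so it is only useful for checking candidate solutions at these small sizes:

```python
import numpy as np

def objective(A, B, As, Bs):
    """Frobenius-norm objective, formed naively (fine for small p, q;
    this is exactly the computation I want to avoid at scale)."""
    M = sum(np.kron(Ai, Bi) for Ai, Bi in zip(As, Bs))
    return np.linalg.norm(np.kron(A, B) - M, "fro")
```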

Someone published an algorithm to solve this here (nkp_sum), but I can find no reference for it or any justification that it is good. In some experiments it performs quite poorly (e.g., it fails on nkp_sum(As=[A, A], Bs=[A, 0]), which should return (A, A) since the sum is just $A \otimes A$).

I won't describe the algorithm in detail because I am not clear on if, when, or why it works, but it essentially returns a weighted sum of the $A_i$'s and a weighted sum of the $B_i$'s, with weights taken from the principal eigenvectors of the Gram matrices of the $A_i$'s and of the $B_i$'s, respectively. A sketch of my reading of it is below.
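Roughly, that description suggests something like the following (this is my own reconstruction, not the published nkp_sum code, so it may differ from it in details such as scaling):

```python
import numpy as np

def nkp_sum_as_described(As, Bs):
    """Weighted sums of the A_i and B_i, weighted by the principal eigenvectors
    of their respective Gram matrices (my reconstruction of the description)."""
    # Gram matrices of the factors under the Frobenius inner product.
    GA = np.array([[np.sum(Ai * Aj) for Aj in As] for Ai in As])
    GB = np.array([[np.sum(Bi * Bj) for Bj in Bs] for Bi in Bs])
    # Principal eigenvectors give the mixing weights.
    wA = np.linalg.eigh(GA)[1][:, -1]
    wB = np.linalg.eigh(GB)[1][:, -1]
    A = sum(w * Ai for w, Ai in zip(wA, As))
    B = sum(w * Bi for w, Bi in zip(wB, Bs))
    return A, B
```

In this form the unit-norm eigenvector weights do not fix the overall scale of $A \otimes B$, which is one way such a scheme can miss even the easy example above.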

Is there a solution to this minimization that takes advantage of the structure of the target matrix, i.e., one that does not 'naively' form the sum and then decompose it? I believe the naive approach, which forms the sum and computes the SVD of its rearrangement, costs $O(\min(p^2 q^4, p^4 q^2))$, and I'd like to get this down. An approximate solution might suffice.

A solution or approximation for $k = 2$ would also be very helpful.


(I wouldn't be surprised if there isn't a good answer, since any matrix can be expressed as a sum of Kronecker products, but I'm hopeful that the dimensions and the fact that the $A_i, B_i$ are PSD might help.)

Best Answer

I may add more detail when I have time, but here is an approach for an approximation: as described in the original Van Loan and Pitsianis paper [1], the minimization of $\|M - A \otimes B\|_F$ can be solved by rearranging $M$ into a matrix $\mathcal{R}(M)$ whose rows are the vectorized $q \times q$ blocks of $M$, computing the best rank-1 approximation $\sigma_1 u_1 v_1^T$ of $\mathcal{R}(M)$ via the SVD, and reshaping the leading singular vectors into $A$ and $B$ (see the definition of $\mathcal{R}(M)$ and Corollary 2.2).
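A minimal NumPy sketch of that single-matrix procedure for the square case in the question ($p \times p$ and $q \times q$ factors); the function names are mine, and I use NumPy's row-major flattening consistently in place of the column-stacking vec of [1], which yields the same minimizer:

```python
import numpy as np

def rearrange(M, p, q):
    """Van Loan-Pitsianis rearrangement R(M): each q-by-q block of the
    pq-by-pq matrix M becomes one row of a p^2-by-q^2 matrix."""
    blocks = M.reshape(p, q, p, q).transpose(0, 2, 1, 3)  # blocks[i, j] = M[i*q:(i+1)*q, j*q:(j+1)*q]
    return blocks.reshape(p * p, q * q)

def nearest_kron(M, p, q):
    """Best A (p x p), B (q x q) minimizing ||M - kron(A, B)||_F,
    via the rank-1 SVD of R(M)."""
    R = rearrange(M, p, q)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    A = np.sqrt(s[0]) * U[:, 0].reshape(p, p)
    B = np.sqrt(s[0]) * Vt[0].reshape(q, q)
    return A, B
```

This is the route the question wants to avoid, since it forms and decomposes the full $p^2 \times q^2$ rearrangement.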

To compute an approximation efficiently, we use two facts: $\mathcal{R}(\sum_{i=1}^k A_i \otimes B_i) = \sum_{i=1}^k \operatorname{vec}(A_i)\operatorname{vec}(B_i)^T$, so its action on a vector costs only $O(k(p^2 + q^2))$, and we only require a rank-1 approximation of it. Halko, Martinsson, and Tropp [2] give an excellent review of probabilistic matrix-decomposition algorithms; equation 6.2 (in Section 6.2) gives the time complexity of Algorithm 4.1, which computes an orthonormal matrix whose range approximates that of a given matrix. In this case, I think the total cost is also simply $O(k(p^2 + q^2))$ because we only require a rank-1 approximation.
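As a sketch of how that can look in code (assuming SciPy is available; names are mine): represent $\mathcal{R}(\sum_i A_i \otimes B_i)$ implicitly through its matrix-vector products and ask an iterative solver for the leading singular triplet. Here scipy.sparse.linalg.svds stands in for Algorithm 4.1 of [2]; a randomized range finder would slot in the same way:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, svds

def nearest_kron_of_sum(As, Bs, p, q):
    """Best single Kronecker pair (A, B) for sum_i kron(A_i, B_i), without
    ever forming the pq-by-pq sum or its p^2-by-q^2 rearrangement."""
    # R(sum_i A_i (x) B_i) = sum_i vec(A_i) vec(B_i)^T: a rank-<=k, p^2 x q^2 matrix.
    W = np.stack([Ai.reshape(-1) for Ai in As], axis=1)   # p^2 x k
    V = np.stack([Bi.reshape(-1) for Bi in Bs], axis=1)   # q^2 x k
    R = LinearOperator(
        (p * p, q * q),
        matvec=lambda x: W @ (V.T @ x),      # O(k(p^2 + q^2)) per product
        rmatvec=lambda y: V @ (W.T @ y),
        dtype=W.dtype,
    )
    # Leading singular triplet of the implicitly represented R.
    U, s, Vt = svds(R, k=1)
    A = np.sqrt(s[0]) * U[:, 0].reshape(p, p)
    B = np.sqrt(s[0]) * Vt[0].reshape(q, q)
    return A, B
```

Each matrix-vector product costs $O(k(p^2 + q^2))$ and nothing of size $p^2 \times q^2$ is ever formed; the returned pair is only determined up to the usual scale/sign ambiguity of a Kronecker factorization.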

[1] Van Loan and Pitsianis, Approximation with Kronecker Products, 1993.

[2] Halko, Martinsson, and Tropp, Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, 2011.