I think the Wikipedia article contains the proof, but in any case it is easy. The proof of correctness of the Durstenfeld algorithm is the same as for the original procedure by Fisher and Yates: induction on the number $n$ of elements to permute (the case $n=0$ is trivial). One starts with the items in any order; since they are going to be randomly permuted, this order is of no importance. To get a random permutation of $n$ distinct items, the first item must be chosen uniformly at random. Once this element is chosen, the remaining elements, which we can arrange in any order (the only difference between Fisher-Yates and Durstenfeld is which order the remainder starts out in), must be permuted uniformly at random, which is done by applying the algorithm recursively to the list of those remaining items.
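The inductive step above unrolls into a simple loop; a minimal Python sketch (the function name is mine, and the standard library generator is used for illustration):

```python
import random

def durstenfeld_shuffle(items):
    """In-place Fisher-Yates/Durstenfeld shuffle (forward variant).

    Step i chooses the element for position i uniformly among the
    n - i items not yet placed, exactly as in the inductive argument:
    position i is uniform, then the remaining suffix is shuffled.
    """
    n = len(items)
    for i in range(n - 1):
        j = random.randrange(i, n)          # uniform over positions i..n-1
        items[i], items[j] = items[j], items[i]
    return items
```

The backward variant (looping $i$ from $n-1$ down to $1$ with $j$ drawn uniformly from $0,\dots,i$) is the form usually printed as Durstenfeld's algorithm; both are correct by the same induction.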
Alternatively, one can show (also by induction) that when the random number generator is called $n$ times to produce an element of the Cartesian product $[n]\times[n-1]\times\cdots\times[1]$, there is a bijection between the sequences the generator can produce and the resulting permutations; so if the Cartesian product is sampled uniformly at random, so is the permutation produced.
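This bijection can be checked exhaustively for small $n$; a sketch in Python (helper name mine), encoding each factor $[n-i]$ as a 0-based offset:

```python
from itertools import product

def permutation_from_choices(choices):
    """Map a tuple (j_0, ..., j_{n-1}) with 0 <= j_i < n - i to a
    permutation, by running the shuffle with those fixed choices."""
    n = len(choices)
    items = list(range(n))
    for i, j in enumerate(choices):
        items[i], items[i + j] = items[i + j], items[i]
    return tuple(items)

n = 4
# The Cartesian product [n] x [n-1] x ... x [1], with 0-based offsets.
all_choices = product(*(range(n - i) for i in range(n)))
perms = [permutation_from_choices(c) for c in all_choices]
# Bijection: n! distinct choice sequences give n! distinct permutations.
assert len(perms) == len(set(perms)) == 24
```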
The difficulty is not the probabilistic argument, but the requirements imposed on the random generator. When producing a random permutation of a large number of items (say $n=10000$), it is very likely that the Cartesian product to be sampled is simply larger than the state space of the random generator, which means that a given generator cannot possibly generate all $n!$ permutations, let alone generate them uniformly. This has nothing to do with the algorithm used to produce the permutation.

Whether this is a problem depends on what the permutation is going to be used for. If it is just to ensure some mixing in a statistical procedure, I think it does not matter that certain permutations will never show up; the important point is that the subset of permutations that can actually be produced passes any easily executed statistical test (such as counting the number of fixed points) in the same way as the set of all permutations would. If however the procedure is used to produce a drawing for a lottery where large amounts of money are involved, it would be quite embarrassing if a player could prove (maybe by exhaustive search of the state space) that the generator used for the drawing cannot possibly produce the combination he happens to hold, as for him it is then no longer a game of chance, but simply a fraud. The question arises what one really means by a "random permutation".
So to answer the added question, it is clearly the case that:

Any permutation generation algorithm based on a pseudo-random number generator will fail to sample uniformly from the space of all permutations of a sufficiently large fixed size; for instance, any size $n$ such that $n!$ exceeds the number of states of the pseudo-random number generator.
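To make "larger than the state space" concrete, here is a quick computation assuming the widely used Mersenne Twister (Python's default generator), whose internal state has 19937 bits:

```python
from math import lgamma, log

STATE_BITS = 19937          # bits of state in the Mersenne Twister (MT19937)

def log2_factorial(n):
    """log2(n!) via the log-gamma function, avoiding huge integers."""
    return lgamma(n + 1) / log(2)

# Smallest n for which n! exceeds the 2**19937 possible generator states:
n = 1
while log2_factorial(n) <= STATE_BITS:
    n += 1
print(n)   # -> 2081: the generator cannot reach all permutations of 2081 items
```

So already for a couple of thousand items, most permutations are unreachable, well below the $n=10000$ of the example above.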
I commend to you the Hungarian method. It's an algorithm that I'm not up to writing out here, but it's available in many combinatorics, discrete math, and graph theory textbooks, also many places on the web: Wikipedia in particular will get you started.
Best Answer
Using the excellent indications given by @Rodrigo de Azevedo, in the following form where $\|\cdot\|_F$ denotes the Frobenius norm (see the remark at the bottom), your problem is equivalent to:
$$\text{minimize} \ \|AP-B\|_F^2=\|AP\|_F^2-2\langle AP,B\rangle+\|B\|_F^2=\|A\|_F^2-2 \ \text{trace}(P^TA^TB)+\|B\|_F^2,$$

where $\|AP\|_F=\|A\|_F$ because a permutation matrix $P$ is orthogonal, and $\langle AP,B\rangle=\text{trace}((AP)^TB)=\text{trace}(P^TA^TB)$ is the Frobenius inner product.
Have a look at https://en.wikipedia.org/wiki/Frobenius_inner_product
As $\|A\|_F^2+\|B\|_F^2$ is constant, it remains to:
$$\text{maximize} \ \text{trace}(PC) \ \text{with} \ C:=A^TB$$
over the group of permutation matrices $P$ (writing $P$ instead of $P^T$ is harmless, since $P \mapsto P^T$ is a bijection of that group).
Edit:
The operation $\text{trace}(PC)$ can be understood in the particular $3 \times 3$ case:
$$\text{trace} \begin{pmatrix}1&0&0\\0&0&1\\0&1&0\end{pmatrix}\begin{pmatrix}c_{11}&c_{12}&c_{13}\\c_{21}&c_{22}&c_{23}\\c_{31}&c_{32}&c_{33}\end{pmatrix}=c_{11}+c_{23}+c_{32}.$$
A straightforward generalization to the $n \times n$ case shows that the objective is to find, among all sums
$$s_{\sigma}=c_{1\sigma(1)}+c_{2\sigma(2)}+\cdots+c_{n\sigma(n)} \ \text{,} \ \sigma \in \frak{S}_n$$
the one which is maximal.
Fortunately, this can be done with the efficient Hungarian algorithm applied to the matrix $C$, or, more exactly, since its standard form deals with minimization, with a version adapted to maximization (for instance by running it on $-C$).
Why do we say efficient? Because this algorithm has complexity $O(n^4)$ ($O(n^3)$ in improved variants) instead of the $O(n!)$ complexity of the brute-force approach.
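In practice this assignment step need not be coded by hand: SciPy ships an implementation. A sketch on synthetic data, maximizing $\text{trace}(P^TC)$, which is the form appearing in the derivation $\text{trace}(P^TA^TB)$ above (the two formulations differ only by transposing $P$):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((4, n))
P_true = np.eye(n)[rng.permutation(n)]   # a random ground-truth permutation
B = A @ P_true                           # so min ||AP - B||_F is 0, at P = P_true

C = A.T @ B                              # C := A^T B
# Maximize trace(P^T C) = sum_i C[i, sigma(i)]: an assignment problem on C.
row, col = linear_sum_assignment(C, maximize=True)
P = np.zeros((n, n))
P[row, col] = 1.0                        # P[i, sigma(i)] = 1

assert np.allclose(A @ P, B)             # the hidden permutation is recovered
```

`linear_sum_assignment` handles the maximization variant directly via its `maximize=True` flag, so no sign flip of $C$ is needed.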
Remarks :
In fact, the operation $A \to AP$, where $P$ is a permutation matrix, performs a permutation of columns, which is clearly equivalent to the permutation of rows given by $A \to PA$.
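This remark can be verified directly in NumPy (the matrix size and the permutation chosen are arbitrary):

```python
import numpy as np

A = np.arange(9).reshape(3, 3)
perm = [2, 0, 1]
P = np.eye(3, dtype=int)[perm]      # P[k, perm[k]] = 1

# Right multiplication permutes columns: column k of A lands in column perm[k].
assert (A @ P)[:, 2].tolist() == A[:, 0].tolist()
# Left multiplication permutes rows: row i of P @ A is row perm[i] of A.
assert (P @ A)[0].tolist() == A[2].tolist()
```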
Related: https://math.stackexchange.com/q/175893.