[Math] Optimization problem on trace of rotated positive definite matrices

linear algebramatricesmatrix analysisoc.optimization-and-controlquadratic-forms

Given two $n \times n$ symmetric positive definite matrices $A$ and $B$, I am interested in solving the following optimization problem over $n \times n$ unitary matrices $R$:
$$
\mathrm{arg}\max_R \,\mathrm{trace}(RAR^TB)~~~\text{s.t.}~~~RR^T = I_n~.
$$
More generally, given two sets of $m$ positive definite matrices $\{A_i\}_{i=1}^m$ and $\{B_i\}_{i=1}^m$ I would like to solve:
$$
\mathrm{arg}\max_R \,\sum_i\mathrm{trace}(RA_iR^TB_i)~~~\text{s.t.}~~~RR^T = I_n~.
$$

If I recall correctly, I have seen the following inequality
$$
\mathrm{trace}(R\,\mathrm{diag}({\bf c})\,R^T\,\mathrm{diag}({\bf d})) \le {\bf c}^T{\bf d}
$$
for positive vectors ${\bf c}$ and ${\bf d}$. If this inequality is correct, then $R=I_n$ provides the optimal solution for diagonal matrices and using spectral decomposition of non-diagonal $A$ and $B$ we can solve the problem in the case of $m = 1$. Can somebody please show me how to prove this inequality? More importantly, Is there a way to solve the problem in the more general case of $m > 1$?

Best Answer

If your $B_i$ commute among themselves, and the $A_i$ commute among themselves, then the $m=1$ solution extends somewhat. Indeed, in that case write $A_i=V C_i V^T$, $B_i=UD_iU^T$ where $U,V$ are unitary and $C_i,D_i$ diagonal. Now, optimizing over $R$ becomes the same (after the right and left multiplication by $V^T$ and $U$) as optimizing over $W$ orthogonal in the expression $$tr \sum_i WD_iW^TC_i=\sum_{j} \sum_{i,k} w_{jk}^2D_i(k)C_i(j)= \sum_{j,k} s_{jk} \sum_i D_i(k)C_i(j)=: \sum_{j,k} s_{jk} T(k,j),$$ where $T(k,j)=\sum_i D_i(k)C_i(j)\geq 0 $. Here, the $s_{jk}$ form a doubly stochastic matrix. By Birkhoff's theorem, the extreme points of the set of doubly stochastic matrices are permutations, so the optimal solution for the above convex problem is a permutation matrix $\{s_{j,k}\}$ ; unlike the case where $\sum_i$ is trivial, I am not sure there is a simple description of the optimizing permutation.

Remark: when $m=1$, the argument above shows the inequality you wanted: the optimal permutation in that case is the one that orders the eigenvalues of $B$ in the same order as those of $A$.

Related Question