Linear Algebra – Maximizing the Trace in an Elegant Way

linear algebramatricesoptimizationtrace

Suppose $\Lambda = {\rm diag}(\lambda_1,\cdots,\lambda_n)$, $\Sigma = {\rm diag}(\sigma_1,\cdots,\sigma_n)$, and we have $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n \geq 0$ and $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_n \geq 0$. I want to prove that
$${\rm max}[ {\rm Tr}(O^{\rm T}\Lambda O \Sigma)] = \sum_{i=1}^n \lambda_i \sigma_i,$$
where $OO^{\rm T} = I$. One approach is to maximize the function
$$f(o_{ij}) = \sum_{i,j} \lambda_i (o_{ij})^2 \sigma_j$$
under the constraint $\sum_{k} o_{ik}o_{jk} = \delta_{ij}$ by Lagrange multiplier method. I can prove the statement in this way but the whole proof is lengthy and lack of the flavor of linear algebra. Since the statement "must be true" intuitively, I want to know if there is an elegant way to prove it. A possible way may be mathematical induction but I failed to make it. Any help is appreciated.

Best Answer

(Rewritten from the last part of my answer to Trace minimization with constraints .)

Since the objective function is continuous and the set of all real orthogonal matrices is compact, the maximum trace is continuous in the entries of $\Sigma$. Therefore, by passing to limit, we may assume without loss of generality that $\Sigma$ has distinct diagonal entries.

The matrix exponential of every skew-symmetric matrix is a real orthogonal matrix of determinant one. If $O$ is a global maximiser in your problem, its objective function value must be greater than or equal to the objective function value evaluated at $Oe^{tK}$ (which is real orthogonal) for any real number $t$ and any skew-symmetric $K$. So, if we define $f(t)=\operatorname{tr}(e^{-tK}O^T\Lambda Oe^{tK}\Sigma)$, we must have $f'(0)=0$. Using the cyclic property $\operatorname{tr}(XY)=\operatorname{tr}(YX)$, we get $$ 0=f'(0)=\operatorname{tr}\left((\Sigma O^T\Lambda O-O^T\Lambda O\Sigma)\, K\right). $$ Put $K^T=\Sigma O^T\Lambda O-O^T\Lambda O\Sigma$ (which is indeed skew-symmetric), we get $\operatorname{tr}(K^TK)=0$. Hence $K=0$, meaning that $O^T\Lambda O$ commutes with $\Sigma$. Any matrix that commutes with a diagonal matrix with distinct diagonal entries must have all off-diagonal entries equal to zero. Therefore $O^T\Lambda O$ is a diagonal matrix.

Hence $\Lambda_1:=O^T\Lambda O$ is a diagonal permutation of $\Lambda$, because $\Lambda_1$ and $\Lambda$ are diagonal matrices with the same eigenvalues. Now the problem boils down to finding the diagonal permutation $\Lambda_1$ of $\Lambda$ that maximises $\operatorname{tr}(\Lambda_1\Sigma)$. Obviously, maximum occurs when $\Lambda_1=\Lambda$ or when $O=I$.

Related Question