Minimize Frobenius norm with equality constraint

lagrange multiplierlinear algebramatricesoptimization

I'm facing the following optimisation problem:

\begin{equation}
\begin{aligned}
&\underset{V,A_j}{\text{min}} \sum_j \| X_j-VA_j \|_F^2 \\
&s.t. ~ V^TV = I
\end{aligned}
\end{equation}

(for context, it comes from a shape averaging problem I'm trying to solve, where $X_i ∈ R^{N×2}$ are input matrices sampled from 2D positions of shapes (all rank 2), and I'm trying to seek a shape average $V ∈ R^{N×2}$ , with $A_j$ being arbitrary nonsingular matrices).

I formed a Lagrangian:

\begin{aligned}
L(V,A_j,\lambda)&= \sum_j \mathop{\textrm{Tr}}(X_j^TX_j)-\mathop{\textrm{Tr}}(X_j^TVA_j)-\mathop{\textrm{Tr}}(A_j^TV^TX_j)-\mathop{\textrm{Tr}}(A_j^TA_j)+\mathop{\textrm{Tr}}(\lambda ^T (V^TV-I))\\
\end{aligned}

however I'm finding difficulty finding solutions from this. Ideally I'd like to use SVD decomposition to simplify the result. Any help would be appreciated.

Best Answer

Let $P=\sum_jX_jX_j^T$. Given that $V^TV=I$, we have \begin{align} &\sum_j \| X_j-VA_j \|_F^2\\ &=\operatorname{tr}\sum_j\left(A_j^TA_j-2X_j^TVA+X_j^TX_j\right)\\ &=\operatorname{tr}\sum_j\left(A_j^TA_j-2X_j^TVA+X_j^TVV^TX_j\right) -\operatorname{tr}\sum_jX_j^TVV^TX_j+\operatorname{tr}\sum_jX_j^TX_j\\ &=\sum_j\|V^TX_j-A_j\|_F^2 -\operatorname{tr}(V^TPV) +\operatorname{tr}(P).\tag{1} \end{align} In practice, since the matrices $X_j$s are usually obtained from random, noisy data, the positive eigenvalues of $P$ are almost always distinct. Suppose that this is the case and let $u,v$ be the unique (up to signs) unit eigenvectors of $P$ corresponding to the two largest eigenvalues of $P$. The term $-\operatorname{tr}V^TPV$ is then minimised if and only if $V$ is the augmented matrix $[u,v]$.

It follows that if $V^TX_j$ is nonsingular for every $j$, the global minimum of $(1)$ is attained when $A_j=V^TX_j$ and the minimum value is given by $$ m=0-\left(\lambda_1^\downarrow(P)-\lambda_2^\downarrow(P)\right)+\operatorname{tr}(P)=\sum_{i\ge3}\lambda_i^\downarrow(P). $$

If $V^TX_j$ is singular for some $j$, then $m$ is an unattainable infimum. For instance, if all $X_j$s are zero, then $\sum_j\|X_j-VA\|_F^2=\sum_j\|VA_j\|_F^2$ can be arbitrarily close to zero (by picking, for example, $A_j=\epsilon I_2$ for a smaller and smaller $\epsilon>0$), but it can never be zero because $\|VA_j\|_F>0$ when $V\ne0$ and $A_j$ is nonsingular. Therefore the infimum value $m=0$ is unattainable.