Linear Algebra – Distance of a Matrix from the Orthogonal Group

Tags: inner-products, linear-algebra, matrices, metric-spaces

Let $\lVert A\rVert
=
\left(\sum_{i,j=1}^n \left| a_{ij} \right|^2\right)^{\frac{1}{2}}
=
\sqrt{\operatorname{tr}(A^\top A)}$
be the Frobenius norm on $n \times n$ matrices.
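As a quick numerical sanity check of the two expressions for the norm, here is a short numpy sketch (the random matrix and seed are of course arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

# Entrywise definition: square root of the sum of squared entries.
entrywise = np.sqrt(np.sum(A**2))

# Trace definition: sqrt(tr(A^T A)).
trace_form = np.sqrt(np.trace(A.T @ A))

assert np.isclose(entrywise, trace_form)
assert np.isclose(entrywise, np.linalg.norm(A, 'fro'))
```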

Fix $A \in GL_n(\mathbb{R})$.

1) Is there a formula for $\operatorname{dist}(A,O(n))$?

where $\operatorname{dist}^2(A,O(n)) =\underset{X \in O(n)}{\min} \|A - X\|^2$

($O(n)$ is the orthogonal group, i.e. the matrices satisfying $X^TX=I$;
the minimum exists since $O(n)$ is compact.)

Note: In general we don't always have a unique minimizer (i.e. there can be more than one orthogonal matrix which is closest to $A$ in $O(n)$), at least if we consider non-invertible matrices $A$.

For example, $A=0$ is at the same distance from every element of $O(n)$, since the Frobenius norm of any orthogonal matrix is $\sqrt n$.
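The claim $\|Q\|_F = \sqrt{n}$ for orthogonal $Q$ follows from $\operatorname{tr}(Q^TQ) = \operatorname{tr}(I) = n$; a minimal numpy check (using QR of a random Gaussian matrix to produce an orthogonal $Q$):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# QR of a random Gaussian matrix yields an orthogonal Q.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))

# Q^T Q = I, so ||Q||_F = sqrt(tr(I)) = sqrt(n).
assert np.isclose(np.linalg.norm(Q, 'fro'), np.sqrt(n))
```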

Question: Can we prove the minimizer is unique (if $A \in GL_n(\mathbb{R})$)? If so, is the function $GL_n(\mathbb{R}) \to \mathbb{R}^{n^2}\,,\; A \mapsto X(A)$, where $X(A)$ is the minimizer, smooth? Can we give an explicit formula for it?


I have tried using Lagrange multipliers, but so far with no success. (I couldn't determine whether the gradients of all the $n^2$ constraints are always linearly independent.)

Remark:
\begin{align}
\|A - X\|^2 &= \mathrm{tr} \left( (A-X)^t (A-X) \right) \\
&= \mathrm{tr} \left( (A^t -X^t) (A-X) \right) \\
&= \mathrm{tr} (A^tA - A^tX - X^tA + X^tX) \\
&= \mathrm{tr} (A^tA) - \mathrm{tr}(A^tX) - \mathrm{tr}((A^tX)^t) + \mathrm{tr} (I)\\
&= n + \mathrm{tr} (A^tA) - 2\mathrm{tr}(A^tX)
\end{align}

so minimizing $\|A - X\|$ is equivalent to maximizing $\mathrm{tr}(A^tX)$.
(In particular, the objective function to optimize is linear, not quadratic.)
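The identity $\|A - X\|^2 = n + \mathrm{tr}(A^tA) - 2\,\mathrm{tr}(A^tX)$ for orthogonal $X$ is easy to check numerically; a sketch with an arbitrary $A$ and a random orthogonal $X$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))

# Random orthogonal X via QR of a Gaussian matrix.
X, _ = np.linalg.qr(rng.standard_normal((n, n)))

lhs = np.linalg.norm(A - X, 'fro')**2
# tr(X^T X) = tr(I) = n, which produces the constant term below.
rhs = n + np.trace(A.T @ A) - 2 * np.trace(A.T @ X)
assert np.isclose(lhs, rhs)
```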

Best Answer

I have a guess as to the minimizer. Perhaps there's a neat proof that this happens to be the right answer.

Note that $A$ has a singular value decomposition $$ A = U\Sigma V^T $$ where $\Sigma$ is in this case a square, diagonal matrix whose diagonal elements are the (strictly positive) singular values of $A$, and $U,V$ are orthogonal matrices.

My suspicion is that $$ \operatorname{dist}(\Sigma,O(n)) = \|\Sigma - I\| $$ where $I$ is the identity matrix. By the orthogonal invariance of the Frobenius norm, we can equivalently state that $$ \operatorname{dist}(A,O(n)) = \|A - UV^T\| = \sqrt{\sum_{j=1}^n [s_j(A)-1]^2} $$ If $UV^T$ does turn out to be a minimizer in general, there is no reason (at first glance) to believe that this minimizer is unique.
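The guessed minimizer $UV^T$ (the orthogonal factor of the polar decomposition of $A$) can be tested empirically: compute it from the SVD, verify the closed-form distance $\sqrt{\sum_j (s_j - 1)^2}$, and check that random orthogonal matrices never come closer. This is only a numerical sanity check, not a proof:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))  # almost surely invertible

U, s, Vt = np.linalg.svd(A)
X_star = U @ Vt  # candidate minimizer: the orthogonal polar factor

d_star = np.linalg.norm(A - X_star, 'fro')
# Closed-form distance in terms of the singular values.
assert np.isclose(d_star, np.sqrt(np.sum((s - 1)**2)))

# Sanity check: random orthogonal matrices should not beat X_star.
for _ in range(1000):
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    assert np.linalg.norm(A - Q, 'fro') >= d_star - 1e-9
```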

There is a condition that would guarantee a unique minimizer that I suspect is true. In particular, I think that $SO(n)$ is the boundary of its own convex hull. If that is the case, then the minimizer of the distance from any $X$ outside of this convex hull is unique.

Note that uniqueness does not hold for elements on the interior of the convex hull. In particular, every element of $O(n)$ is at the minimum distance from $X = 0$. I suspect it is this condition (rather than invertibility) that "causes" uniqueness to fail.


Edit: see levap's answer, which looks correct to me. It seems that a unique minimizer indeed exists if and only if $A$ is invertible.