Computability Theory – Why is Uncomputability of the Spectral Decomposition Not a Problem?

computability-theory, linear algebra, matrices, numerical linear algebra

Below, we compute with exact real numbers using a realistic/conservative model of computability such as TTE (Type-Two Effectivity).

Assume that there is an algorithm that, given a symmetric real matrix $M$, finds an eigenvector $v$ of $M$ of unit length.

Let

$$M(\epsilon) = \begin{cases}
\left[\begin{matrix}1 & \epsilon\\ \epsilon & 1\end{matrix}\right]
,& \epsilon \geq 0 \\
\left[\begin{matrix}1 - \epsilon & 0\\0 & 1 + \epsilon\end{matrix}\right]
,& \epsilon \leq 0
\end{cases}$$

and assume that it's possible to find an eigenvector $v$ of $M(\epsilon)$.

  • If $\epsilon > 0$ then $v$ must necessarily be $\pm \frac 1 {\sqrt 2}\left[\begin{matrix}1\\1\end{matrix}\right]$ or $\pm \frac 1 {\sqrt 2}\left[\begin{matrix}-1\\1\end{matrix}\right]$. Observe that in all four cases, the $L^1$ norm of $v$ is $\sqrt 2$.

  • If $\epsilon < 0$, then $v$ must necessarily be $\pm\left[\begin{matrix}1\\0\end{matrix}\right]$ or $\pm\left[\begin{matrix}0\\1\end{matrix}\right]$. Observe that in all four cases, the $L^1$ norm of $v$ is $1$.

Since the $L^1$ norm of $v$ is either $1$ or $\sqrt 2$, we can determine, by computing it to sufficient accuracy, whether it is less than $\sqrt 2$ or greater than $1$ (exactly one of the two tests succeeds). Therefore we can decide whether $\epsilon \leq 0$ or $\epsilon \geq 0$, which is impossible!

In a way, this is strange, because many sources say that the Singular Value Decomposition (SVD) and Schur Decomposition (which are generalisations of the Spectral Decomposition) are numerically stable. They're also widely used in numerical applications. But I've just tested the examples above for small $\epsilon$ using SciPy and got incorrect results.
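For concreteness, such a test might look like the following sketch (the choice $\epsilon = \pm 10^{-12}$ and the use of `scipy.linalg.eigh` are illustrative assumptions; what a floating-point eigensolver returns for $\epsilon$ near or below machine precision is exactly what is at issue):

```python
import numpy as np
from scipy.linalg import eigh  # symmetric eigensolver

def M(eps):
    """The matrix family from the question (in floating point, not exact reals)."""
    if eps >= 0:
        return np.array([[1.0, eps], [eps, 1.0]])
    return np.array([[1.0 - eps, 0.0], [0.0, 1.0 + eps]])

for eps in (1e-12, -1e-12):      # hypothetical small perturbations
    w, V = eigh(M(eps))          # columns of V are unit eigenvectors
    v = V[:, 0]
    print(f"eps = {eps:+.0e}, eigenvalues = {w}, L1 norm of v = {np.abs(v).sum():.6f}")

# In exact arithmetic the L^1 norm is sqrt(2) for eps > 0 and 1 for eps < 0,
# which is the jump the argument above exploits.
```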

So my question is, how do numerical analysts get around this problem? Or why is this apparently not a problem?

I could venture some guesses: While finding eigenvectors of general matrices may be impossible, it is possible to find their eigenvalues. Also, it's possible to "shift" a problematic matrix by some small $\epsilon$ so that its eigendecomposition is computable.

Best Answer

The singular value decomposition, when applied to a real symmetric matrix $A = \sum_i \lambda_i(A) u_i(A) u_i(A)^T$, computes a stable mathematical object (spectral measure $\mu_A = \sum_i \delta_{\lambda_i(A)} u_i(A) u_i(A)^T$, which is a projection-valued measure) using a partially unstable coordinate system (the eigenvalues $\lambda_i(A)$ and eigenvectors $u_i(A)$; the eigenvalues are stable, but the eigenvectors are not). The numerical instability of the latter reflects the coordinate singularities of this coordinate system, but does not contradict the stability of the former. But in numerical computations we have to use the latter rather than the former, because standard computer languages have built-in data representations for numbers and vectors, but usually do not have built-in data representations for projection-valued measures.

An analogy is with floating-point arithmetic. The operation of multiplication of two floating point numbers (expressed in binary $x = \sum_i a_i(x) 2^{-i}$ or decimal $x = \sum_i b_i(x) 10^{-i}$) is a stable (i.e., continuous) operation on the abstract real numbers ${\bf R}$, but when viewed in a binary or decimal representation system becomes "uncomputable". For instance, the square of $1.414213\dots$ could be either $1.99999\dots$ or $2.0000\dots$, depending on exactly what is going on in the $\dots$; hence questions such as "what is the first digit of the square of $1.414213\dots$" are uncomputable. But this is an artefact of the numeral representation system used and is not an indicator of any lack of stability or computability for any actual computational problem that involves the abstract real numbers (rather than an artificial problem that is sensitive to the choice of numeral representation used). In contrast, floating point division when the denominator is near zero is a true singularity; regardless of what numeral system one uses, this operation is genuinely discontinuous (in a dramatic fashion) on the abstract reals and generates actual instabilities that cannot be explained away as mere coordinate singularity artefacts.
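As a toy illustration of the digit problem (the six-digit truncation of $\sqrt 2$ is an arbitrary choice):

```python
# We only know x = 1.414213... to six places, i.e. x lies in the interval
# [1.414213, 1.414214].  Squaring the endpoints:
lo, hi = 1.414213, 1.414214
print(lo * lo, hi * hi)   # roughly 1.999998... and 2.000001...

# The interval for x^2 straddles 2, so no finite prefix of x's decimal
# expansion determines whether x^2 is written as 1.99999... or 2.00000...,
# even though squaring itself is a continuous (stable) operation on the reals.
```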

Returning to matrices, whereas the individual eigenvectors $u_i(A)$ of a real symmetric matrix $A$ are not uniquely defined (there is a choice of sign for $u_i(A)$, even when there are no repeated eigenvalues) or continuously dependent on $A$, the spectral measure $\mu_A := \sum_i \delta_{\lambda_i(A)} u_i(A) u_i(A)^T$ is unambiguous; it is the unique projection-valued measure for which one has the functional calculus $$ f(A) = \int_{\bf R} f(E)\ d\mu_A(E)$$ for any polynomial $f$ (or indeed for any continuous function $f \colon {\bf R} \to {\bf R}$). The spectral measure $\mu_A$ depends continuously on $A$ in the vague topology; indeed one has the inequality $$ \| f(A) - f(B) \|_F \leq \|f\|_\text{Lip} \|A-B\|_F$$ for any real symmetric $A,B$ and any Lipschitz $f$, where $\|\cdot\|_F$ denotes the Frobenius norm (also known as the Hilbert-Schmidt norm or 2-Schatten norm). This allows for the possibility of stable computation of this measure, and indeed standard algorithms such as tridiagonalisation methods using (for instance) the QR factorisation and Householder reflections do allow one to compute this measure in a numerically stable fashion (e.g., small roundoff errors only lead to small variations in any test $\int_{\bf R} f(E)\ d\mu_A(E)$ of the spectral measure $\mu_A$ against a given test function $f$), although actually demonstrating this stability rigorously for a given numerical SVD algorithm does require a non-trivial amount of effort.
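For illustration only, a small numerical check of this Lipschitz inequality might look like the following sketch (the random symmetric matrices and the choice $f(x) = |x|$, which is 1-Lipschitz, are arbitrary assumptions; the functional calculus is evaluated through `scipy.linalg.eigh`):

```python
import numpy as np
from scipy.linalg import eigh

def apply_f(A, f):
    """Functional calculus f(A) = sum_i f(lambda_i) u_i u_i^T for symmetric A."""
    w, V = eigh(A)
    return (V * f(w)) @ V.T

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 5))
A = (X + X.T) / 2                            # random symmetric matrix
E = rng.standard_normal((5, 5))
B = A + 1e-6 * (E + E.T) / 2                 # small symmetric perturbation of A

f = np.abs                                   # 1-Lipschitz test function
lhs = np.linalg.norm(apply_f(A, f) - apply_f(B, f), 'fro')
rhs = 1.0 * np.linalg.norm(A - B, 'fro')
print(lhs, "<=", rhs)                        # the Frobenius-norm Lipschitz bound
```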

The practical upshot of this is that if one uses a numerically stable SVD algorithm to compute a quantity that can be expressed as a numerically stable function of the spectral measure (e.g., the inverse $A^{-1}$, assuming that the spectrum is bounded away from zero), then the computation will be stable, despite the fact that the representation of this spectral measure in eigenvalue/eigenvector form may contain coordinate instabilities. In examples involving eigenvalue collision such as the one you provided in your post, the eigenvectors can change dramatically (while the eigenvalues remain stable), but when the time comes to apply the SVD to compute a stable quantity such as the inverse $A^{-1}$, these dramatic changes "miraculously" cancel each other out and the algorithm becomes numerically stable again. (This is analogous to how a stable floating point arithmetic computation (avoiding division by very small denominators) applied to an input $x = 1.99999\dots$ and an input $x' = 2.00000\dots$ will lead to outcomes that are very close to each other (as abstract real numbers), even though all the digits in the representations of $x$ and $x'$ are completely different; the changes in digits "cancel each other out" at the end of the day.)
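To see this cancellation numerically, one might compute the inverse of the $M(\pm\epsilon)$ matrices from the question through their eigendecompositions (a sketch with a hypothetical $\epsilon = 10^{-9}$): the eigenvectors of the two matrices are completely different, yet the inverses agree to about the size of the perturbation, because the spectrum stays near $1$.

```python
import numpy as np
from scipy.linalg import eigh

def inv_via_eig(A):
    """Inverse computed through the eigendecomposition, i.e. f(A) with f(x) = 1/x."""
    w, V = eigh(A)
    return (V / w) @ V.T

eps = 1e-9                                          # hypothetical small parameter
A = np.array([[1.0, eps], [eps, 1.0]])              # M(+eps): eigenvectors (1, +-1)/sqrt(2)
B = np.array([[1.0 + eps, 0.0], [0.0, 1.0 - eps]])  # M(-eps): eigenvectors e1, e2

# The eigenvectors of A and B are completely different, yet the difference of
# the inverses is only on the order of eps.
print(np.linalg.norm(inv_via_eig(A) - inv_via_eig(B), 'fro'))
```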

[The situation is a bit more interesting when applying the SVD to a non-symmetric matrix $A = \sum_i \sigma_i(A) u_i(A) v_i(A)^T$. Now one gets two spectral measures, $\mu_{(A^* A)^{1/2}} = \sum_i \delta_{\sigma_i(A)} v_i(A) v_i(A)^T$ and $\mu_{(AA^*)^{1/2}} = \sum_i \delta_{\sigma_i(A)} u_i(A) u_i(A)^T$ which are numerically stable, but these don't capture the full strength of the SVD (for instance, they are not sufficient for computing $A^{-1}$). The non-projection-valued spectral measure $\mu_A = \sum_i \delta_{\sigma_i(A)} u_i(A) v_i(A)^T$ does capture the full SVD in this case, but is only stable using the vague topology on the open half-line $(0,+\infty)$, that is to say $\int_0^\infty f(E)\ d\mu_A(E)$ varies continuously with $A$ as long as $f$ is a test function compactly supported in $(0,+\infty)$, but is unstable if tested by functions that do not vanish at the origin. This is ultimately due to a genuine singularity in the polar decomposition of a non-selfadjoint matrix when the matrix becomes singular, which in one dimension is simply the familiar singularity in the polar decomposition of a complex number near the origin.]
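The one-dimensional polar-decomposition singularity mentioned at the end can be seen directly (a sketch with hypothetical tiny inputs):

```python
# Polar decomposition z = (z/|z|) * |z| of a complex number.  The modulus |z|
# is a stable function of z, but the unit factor z/|z| is not near z = 0:
for z in (1e-12 + 0j, -1e-12 + 0j):
    print(z, "->", z / abs(z))   # the unit factor flips from +1 to -1
```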
