Condition number and singular values


Let $m \geq n$ and $A \in \mathbb{R}^{m\times n}$, and let $\sigma_1 \geq \dots \geq \sigma_n \geq 0$ be the singular values of $A$. We know that if $r$ is the rank of $A$, then $\sigma_1,\dots, \sigma_r$ are positive and the rest are $0$. If $r = n$, then all singular values are positive, and we can define the condition number of $A$ to be
\begin{align}
\kappa(A) = \frac{\sigma_1}{\sigma_n}
\end{align}
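
As a quick sanity check, here is a minimal numpy sketch (the random matrix is just an arbitrary example) that computes $\sigma_1/\sigma_n$ from the SVD and compares it with `np.linalg.cond`, which uses the same $2$-norm definition by default:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))         # m >= n; almost surely full column rank

s = np.linalg.svd(A, compute_uv=False)  # singular values, sorted descending
kappa = s[0] / s[-1]                    # sigma_1 / sigma_n

print(kappa, np.linalg.cond(A))         # the two values agree
```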

When $1 \leq r < n,$ some authors define $\kappa(A) = \infty$, but I've seen $\kappa(A)$ defined as
\begin{align}
\kappa(A) = \frac{\sigma_1}{\sigma_r}
\end{align}

where $\sigma_r$ is the smallest positive singular value. If a large condition number means the matrix is ill-conditioned, the first definition says that less-than-full-rank matrices are ill-conditioned by definition. How can we see that less-than-full-rank matrices are ill-conditioned using the second definition? Is $\frac{\sigma_1}{\sigma_r}$ always large? What about when $r = 1$? Then $\sigma_1 = \sigma_r$, so the condition number is actually $1$.
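
For what it's worth, here is a small numpy illustration of the $r = 1$ case (the rank-one matrix is an arbitrary example): the second definition gives exactly $1$, while the first blows up:

```python
import numpy as np

A = np.outer([1.0, 2.0, 3.0], [4.0, 5.0])  # rank 1, so sigma_1 = sigma_r

s = np.linalg.svd(A, compute_uv=False)
tol = s[0] * max(A.shape) * np.finfo(float).eps
r = int((s > tol).sum())                   # numerical rank, here 1

print(s[0] / s[r - 1])                     # second definition: exactly 1.0
print(s[0] / s[-1])                        # first definition: inf or enormous
```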

Best Answer

Short answer: these are two different definitions for two different things that are both called "condition number" (and there are many more), so of course they don't agree.

Long answer:

The "condition number" of a matrix (or more generally any function) is an indicator for how much the output changes if you change the input. But there are many different ways to precisely define what that is supposed to mean. For example

  • are we talking about relative change or absolute change?
  • how should we measure the size of (the change) of input/output of the function? In multidimensional systems, there are usually many different norms/metrics to choose from.

For matrices, only relative change makes sense, because everything is linear anyway. If you choose the Euclidean norm and restrict yourself to square invertible matrices, the resulting condition number equals the quotient of the largest and smallest singular values, which for an invertible matrix is the same as $\|A\|_2 \, \|A^{-1}\|_2$. So far, everything is good.
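
A minimal numpy check of that identity, on an arbitrary $2 \times 2$ example:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])               # square and invertible

s = np.linalg.svd(A, compute_uv=False)
print(s[0] / s[-1])                      # sigma_max / sigma_min

norm_A = np.linalg.norm(A, 2)                    # ||A||_2 = sigma_max
norm_Ainv = np.linalg.norm(np.linalg.inv(A), 2)  # ||A^-1||_2 = 1 / sigma_min
print(norm_A * norm_Ainv)                # same number
```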

For non-invertible matrices (which includes all non-square matrices), this definition gives infinity. This is correct in the sense that an arbitrarily small (relative) change of the input can create an arbitrarily large (relative) change of the output, so the problem is indeed ill-conditioned in this sense.
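
To see that concretely, here is a nearly singular example (entries chosen by hand): a perturbation of about $10^{-8}$ in the right-hand side of $Ax = b$ changes the solution by a relative factor of order $10^2$:

```python
import numpy as np

eps = 1e-10
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + eps]])         # nearly singular: kappa(A) ~ 4 / eps

b = np.array([2.0, 2.0])
x = np.linalg.solve(A, b)                # x = [2, 0]

b2 = b + np.array([0.0, 1e-8])           # tiny change of the input
x2 = np.linalg.solve(A, b2)              # x2 = [-98, 100]

print(np.linalg.cond(A))                             # ~ 4e10
print(np.linalg.norm(x2 - x) / np.linalg.norm(x))    # huge relative change
```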

But this is not very useful if your goal is to compare different matrices and decide which one is "better behaved" for use in some numerical algorithm. If both options are "infinitely bad," you still don't know which is better (or "less bad"). That's why there are alternative (weaker) definitions of condition number, such as the two definitions you quoted. These produce finite answers in cases where the stronger definition gives infinity, and thus allow you to make an informed choice when all options are "bad" in the stronger sense.
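
To make that concrete, here is a small numpy comparison (both matrices are hand-picked examples): two singular matrices that are both "infinitely bad" under the strict definition, but which the $\sigma_1/\sigma_r$ definition cleanly distinguishes:

```python
import numpy as np

# Both matrices are singular, so the strict 2-norm condition number is
# infinite for both; sigma_1 / sigma_r still tells them apart.
A = np.diag([1.0, 0.9, 0.0])             # well scaled on its column space
B = np.diag([1.0, 1e-8, 0.0])            # badly scaled on its column space

for M in (A, B):
    s = np.linalg.svd(M, compute_uv=False)
    tol = s[0] * max(M.shape) * np.finfo(float).eps
    r = int((s > tol).sum())             # numerical rank
    print(s[0] / s[r - 1])               # ~1.11 for A, 1e8 for B
```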

In any case: if you fix one definition of condition number, it will always be true that a big number is bad and a small number is good.
