Consider a linear problem $$Ax = b \tag{1}$$
This problem is well conditioned if the difference between the $\textbf{exact}$ solution of this problem and the $\textbf{exact}$ solution of a slightly perturbed problem $(A+\delta A)y = b+\delta b$ is small for all small perturbations $\delta A$, $\delta b$. If the problem is badly conditioned, then there exist small perturbations $\delta A$, $\delta b$ such that the difference between $x$ and $y$ is large.
Suppose that problem (1) is solved using some algorithm, and denote the computed solution by $\hat{x}$. Usually $\hat{x}$ is not an exact solution of (1). A method for solving (1) is called backward stable if for any $A$ and $b$ it computes a solution $\hat{x}$ with small backward error, that is, $\hat{x}$ is the exact solution of some slightly perturbed problem $(A+\delta A)\hat{x} = b+\delta b$. For problem (1), the backward error is closely related to the norm of the residual $r = A\hat{x} - b$.
The forward error is defined as the distance between the true solution $x$ of (1) and the solution $\hat{x}$ computed by a given method. The forward error can be estimated as
$$\text{forward error} \leq \text{condition number} \times \text{backward error} \tag{2}$$
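Estimate (2) can be watched in action on a classically ill-conditioned example. The following is a sketch in Python with NumPy; the Hilbert matrix and the all-ones solution are my own illustrative choices, not part of the original text.

```python
import numpy as np

# Hilbert matrix: a standard example of a badly conditioned linear system.
n = 10
A = 1.0 / (np.arange(1, n + 1)[:, None] + np.arange(n)[None, :])
x_true = np.ones(n)
b = A @ x_true

x_hat = np.linalg.solve(A, b)  # LU with partial pivoting: backward stable
forward = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
r = A @ x_hat - b
backward = np.linalg.norm(r) / (np.linalg.norm(A, 2) * np.linalg.norm(x_hat))
cond = np.linalg.cond(A)

# backward is near machine precision, cond is around 1e13, and the forward
# error, while far larger than the backward error, stays below their product.
print(backward, cond, forward)
```

The solver is backward stable, so the residual-based backward error is tiny; the huge condition number is what inflates the forward error, exactly as (2) predicts.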
The backward error and the condition number are completely unrelated: it is possible to have a small backward error for a badly conditioned problem, or a large backward error for a well conditioned problem. For example, consider the problem
$$\left[\begin{array}{cc}\epsilon & 1\\1 &\epsilon\end{array}\right]x = b$$
for some small $\epsilon > 0$. This problem is well conditioned. However, if it is solved using LU factorization without pivoting, then the backward error is large and the computed solution is highly inaccurate.
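Here is a sketch of that failure in Python with NumPy; the value of $\epsilon$ and the right-hand side are my own illustrative choices.

```python
import numpy as np

eps = 1e-20  # tiny pivot; illustrative value
A = np.array([[eps, 1.0], [1.0, eps]])
b = np.array([1.0, 2.0])  # exact solution is very close to x = (2, 1)

# LU factorization without pivoting, written out for the 2x2 case.
l21 = A[1, 0] / A[0, 0]               # enormous multiplier, about 1e20
u22 = A[1, 1] - l21 * A[0, 1]         # eps - 1e20 rounds to -1e20
# Forward substitution (L y = b), then back substitution (U x = y).
y2 = b[1] - l21 * b[0]                # 2 - 1e20 rounds to -1e20
x2 = y2 / u22                         # = 1.0 (accurate, by luck)
x1 = (b[0] - A[0, 1] * x2) / A[0, 0]  # (1 - 1) / eps = 0, completely wrong

x_good = np.linalg.solve(A, b)        # partial pivoting: close to (2, 1)
print(x1, x2)                         # 0.0 1.0
print(x_good)
```

The huge multiplier `l21` wipes out the information in the second row, so the computed $\hat{x}$ has a large backward error even though the matrix itself is perfectly well conditioned; with pivoting the same problem is solved to full accuracy.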
If the condition number is low and a stable algorithm is used (so the backward error is also small), then we can be sure that (1) is solved accurately. It is, however, possible for an unstable algorithm applied to a badly conditioned problem to give an accurate solution. It is always possible to estimate the forward error for a given problem (1) using a posteriori error analysis, which does not rely on (2).
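As a concrete illustration of a posteriori analysis, one can bound the forward error rigorously from the residual, since $x - \hat{x} = A^{-1}(b - A\hat{x})$ gives $\|x - \hat{x}\|_\infty \leq \|A^{-1}\|_\infty \|r\|_\infty$. The sketch below (Python with NumPy; the $3\times 3$ system is made up) verifies the inequality in exact rational arithmetic so that no rounding can spoil the check.

```python
from fractions import Fraction

import numpy as np

# Illustrative well-conditioned 3x3 system.
A = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 1.0],
              [0.5, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x_hat = np.linalg.solve(A, b)  # computed solution (inexact)

AF = [[Fraction(A[i, j]) for j in range(3)] for i in range(3)]
bF = [Fraction(v) for v in b]

def solve_exact(M, rhs):
    """Gaussian elimination over the rationals: no rounding at all."""
    M = [row[:] + [v] for row, v in zip(M, rhs)]
    for k in range(3):
        p = max(range(k, 3), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, 3):
            f = M[i][k] / M[k][k]
            for j in range(k, 4):
                M[i][j] -= f * M[k][j]
    x = [Fraction(0)] * 3
    for i in reversed(range(3)):
        x[i] = (M[i][3] - sum(M[i][j] * x[j] for j in range(i + 1, 3))) / M[i][i]
    return x

x_exact = solve_exact(AF, bF)
# Exact residual r = b - A x_hat of the computed solution.
r = [bF[i] - sum(AF[i][j] * Fraction(x_hat[j]) for j in range(3))
     for i in range(3)]
# ||A^{-1}||_inf = max row sum of |A^{-1}|; columns obtained by exact solves.
inv_cols = [solve_exact(AF, [Fraction(int(i == k)) for i in range(3)])
            for k in range(3)]
norm_Ainv = max(sum(abs(inv_cols[j][i]) for j in range(3)) for i in range(3))
err = max(abs(x_exact[i] - Fraction(x_hat[i])) for i in range(3))
bound = norm_Ainv * max(abs(v) for v in r)
assert err <= bound  # the a posteriori bound holds, with everything exact
```

The point is that the bound uses only quantities computable after the fact, namely the residual and (an estimate of) $\|A^{-1}\|$, with no reference to estimate (2).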
The second statement is false as I shall demonstrate now. Consider the problem of computing the function $f : \mathbb{R} \times \mathbb{R} \rightarrow \mathbb{R}$ given by $$f(x,y) = x - y.$$
I will assume that we are using floating point numbers that respect the IEEE standard, and that $x-y$ is in the representable range. Then there exists a $\delta$ such that the computed value $\hat{z}$ of the real number $z = x - y$ satisfies $$\hat{z} = (x-y)(1+\delta)$$ with $|\delta| \leq u$, where $u$ is the unit roundoff. We observe that $$\hat{z} = f(\bar{x}, \bar{y}),$$ where
$$ \bar{x} = x(1+\delta), \quad \bar{y} = y(1+\delta).$$
This shows that the computed value of $z$ is the exact value of $f$ evaluated at a point $(\bar{x},\bar{y})$ which satisfies $$|x-\bar{x}| \leq u |x|, \quad |y-\bar{y}| \leq u |y|.$$
This shows that the naive algorithm for computing $z$ is backwards stable in the componentwise relative sense. This is the highest and most demanding form of stability.
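This can be checked directly in Python (a sketch of my own; `fractions` supplies the exact value of $x - y$ for comparison, and the input values are arbitrary).

```python
import sys
from fractions import Fraction

u = sys.float_info.epsilon / 2  # unit roundoff for IEEE double precision

x, y = 0.1, 0.7                 # any two representable doubles
z_hat = x - y                   # one IEEE operation: correctly rounded
z_exact = Fraction(x) - Fraction(y)  # exact difference of the two doubles
delta = (Fraction(z_hat) - z_exact) / z_exact
# z_hat = (x - y)(1 + delta) with |delta| <= u, as the standard model promises.
assert abs(delta) <= Fraction(u)
```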
I shall now show that the problem of computing $z$ is ill-conditioned in the relative sense when $x \approx y$, and well-conditioned in the relative sense when $x$ and $y$ are not close. Specifically, suppose that we wish to compute $z=x-y$, but we only have good approximations $$\hat{x} = x(1+\alpha), \quad \hat{y} = y(1+\beta)$$ of $x$ and $y$. Then we cannot hope to do better than to compute $\hat{x} - \hat{y}$. The relative error between the number we want and the best approximation that we can hope to compute satisfies
$$\left|\frac{(x-y) - (\hat{x} - \hat{y})}{(x-y)} \right| \leq \max\{|\alpha|,|\beta|\}\frac{|x| + |y|}{|x-y|}$$
Small values of $|\alpha|$ and $|\beta|$ reflect the fact that $\hat{x}$ and $\hat{y}$ are good approximations of $x$ and $y$, but the right-hand side will be large when $x \approx y$. In this case we have no guarantee that the relative error will be small, and in general it will be large. On the other hand, if $|x| \ge 2|y|$ or $|y| \ge 2|x|$, then the triangle inequality implies that $$ \frac{|x| + |y|}{|x-y|} \leq 3$$ and we have a guarantee that the relative error will be at most $3$ times the largest relative error associated with either $x$ or $y$.
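A quick numerical check of this bound, with made-up values where $x \approx y$:

```python
x, y = 1.0, 0.9999999       # nearly equal: (|x| + |y|)/|x - y| is about 2e7
alpha, beta = 1e-9, -5e-10  # small relative errors in the inputs
x_hat, y_hat = x * (1 + alpha), y * (1 + beta)

rel_err = abs((x - y) - (x_hat - y_hat)) / abs(x - y)
amplification = max(abs(alpha), abs(beta)) * (abs(x) + abs(y)) / abs(x - y)
assert rel_err <= amplification  # the bound above holds...
assert rel_err > 1e-3            # ...but nine accurate input digits leave
                                 # only about two accurate output digits
```

The inputs are accurate to roughly nine digits, yet the relative error of the difference is on the order of $10^{-2}$: the conditioning of subtraction, not any algorithmic defect, is to blame.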
In summary, the naive algorithm for computing $f(x,y) = x - y$ is always backwards stable in the componentwise relative sense, but the problem itself is ill-conditioned for some inputs and well-conditioned for most.
In general, there is no connection between the conditioning of a function $f$ and the stability of an algorithm used to evaluate $f$. This is a good thing! It dramatically simplifies the analysis of the software that implements the algorithm. First we analyze the function $f$ itself. If $f$ is always well-conditioned, then a stable algorithm will return a good approximation of $y = f(x)$ when supplied with a good approximation $\hat{x} \approx x$. If there are points $x$ where $f$ is ill-conditioned, then even a stable algorithm cannot hope to return an accurate approximation of $y = f(x)$, because we cannot even provide the machine with the exact value of $x$. In general, we cannot hope to do better than $\hat{x} = \text{fl}(x)$, the floating-point representation of $x$.
I shall now demonstrate that the first statement is false. Let $f : \mathbb{R}^{n} \times \mathbb{R}^n \rightarrow \mathbb{R}^{n \times n}$ denote the outer product, i.e., $$f(x,y) = xy^T.$$
I shall now demonstrate that $f$ is well-conditioned, but that no algorithm for computing $f$ can be backwards stable.
We begin by exploring the sensitivity of $f$ to small perturbations of the input. Let $\epsilon>0$ be a small fixed number and let $\Delta x$ and $\Delta y$ be perturbations of $x$ and $y$ such that
$$ |\Delta x| \leq \epsilon |x|, \quad |\Delta y| \leq \epsilon |y|$$
These inequalities should be understood in the componentwise sense, i.e.,
$$ |(\Delta x)_i | \leq \epsilon |x_i|$$
and similarly for $y$. Then
$$ f(x + \Delta x, y + \Delta y) = xy^T + x(\Delta y)^T + (\Delta x)y^T + (\Delta x)(\Delta y)^T.$$
It follows that
$$ f(x + \Delta x, y + \Delta y) - f(x,y) = x(\Delta y)^T + (\Delta x)y^T + (\Delta x)(\Delta y)^T$$
We conclude that
$$ |f(x + \Delta x, y + \Delta y) - f(x,y)| \leq 2 \epsilon |x||y|^T + \epsilon^2|x||y|^T = (2 \epsilon + \epsilon^2) |x||y|^T = (2 \epsilon + \epsilon^2)|f(x,y)|$$
Again, this inequality should be understood in the componentwise sense. In short, for every single component of $f$, the error caused by perturbing the input arguments is small relative to the corresponding component of $f$. We conclude that $f$ is well-conditioned with respect to perturbations that are small in the componentwise relative sense.
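A direct numerical check of this componentwise bound (Python with NumPy; the vectors and the perturbation fractions are illustrative choices of mine):

```python
import numpy as np

eps = 1e-8
x = np.array([1.0, -2.0, 0.5, 3.0])
y = np.array([4.0, 0.25, -1.0, 2.0])
dx = 0.3 * eps * x   # componentwise: |dx| <= eps |x|
dy = -0.7 * eps * y  # componentwise: |dy| <= eps |y|

diff = np.abs(np.outer(x + dx, y + dy) - np.outer(x, y))
bound = (2 * eps + eps**2) * np.abs(np.outer(x, y))
# Every entry of the perturbed outer product stays within the
# componentwise bound (2*eps + eps^2) |x||y|^T derived above.
assert np.all(diff <= bound)
```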
Now suppose that $\hat{z}$ is the computed value of the matrix $z = f(x,y) = xy^T$. In general, we cannot have $\hat{z} = f(\bar{x},\bar{y})$! Why is that? Well, if $x, y \in \mathbb{R}^n$, then we have only $2n$ degrees of freedom, but there are $n^2$ equations to satisfy. We have to find $\Delta x$ and $\Delta y$ such that
$$\hat{z} = (x+\Delta x)(y + \Delta y)^T$$ and in general this is not possible. In particular, there is no hope of finding $\Delta x$ and $\Delta y$ that are small relative to $x$ and $y$. We conclude that no algorithm for computing $f$ can be backwards stable in the componentwise relative sense.
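A tiny exact-arithmetic check makes this concrete (Python with NumPy and `fractions`; the input values are my own, chosen so the effect is provable): even for $n = 2$, the entrywise-rounded outer product generally fails to be an outer product at all.

```python
from fractions import Fraction

import numpy as np

# Illustrative 2x2 instance.
x = np.array([1.0, 3.0])
y = np.array([1.0, 1.0 / 3.0])
Z = np.outer(x, y)  # each entry is fl(x_i * y_j), rounded independently

# A 2x2 matrix is an outer product (rank <= 1) iff its determinant is zero.
# Fractions make the determinant test exact -- no further rounding.
det = (Fraction(Z[0, 0]) * Fraction(Z[1, 1])
       - Fraction(Z[1, 0]) * Fraction(Z[0, 1]))
print(det == 0)  # False: rounding has destroyed the rank-one structure
```

The determinant is a nonzero rational at roundoff level, so $\hat{z}$ is not $(x+\Delta x)(y+\Delta y)^T$ for any perturbations whatsoever, let alone small ones.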
I recommend that you set the note aside. These are the two best references that I know:
- The paper "Mixed, componentwise and structured condition numbers" by Israel Gohberg and Israel Koltracht, SIAM Journal on Matrix Analysis and Applications, 1993, Vol. 14, No. 3, pp. 688-704, https://doi.org/10.1137/0614049. The treatment of condition numbers is very systematic.
- The textbook "Accuracy and Stability of Numerical Algorithms" by Nicholas J. Higham, SIAM, 2002, https://doi.org/10.1137/1.9780898718027. This is a good book on many different topics. The treatment of the conditioning of linear systems is excellent.
They are not easy but they are good. Many of the answers that I have provided here are derived from these texts.
EDIT: I am not sure that it is possible to repair the statements in the note. I am sure that the following is safe to teach:
If the problem is well-conditioned, then a stable algorithm will produce an accurate result. If the problem is ill-conditioned, then we cannot expect that a stable algorithm will produce an accurate result.
I can also phrase it like this: If the problem is well-conditioned, then it is not theoretically impossible to write a computer program that returns accurate results. If the problem is ill-conditioned, then we cannot expect that any computer program will return accurate results.
Best Answer
When you subtract two nearly equal floating point numbers you lose precision. In your example, suppose we are working with seven-place base $10$ numbers and let $x=10^{-4}$. Then $t_1=1+x=1.000100$ is exact, and $t_2=\sqrt{t_1}=1.000050$ is correct to within $\frac{1}{2}$ LSB. But when we compute $t_2-1$ we get $5.0 \cdot 10^{-5}$, and only those two places are good: the five leading places canceled out. That is catastrophic loss of precision.
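The same computation can be replayed with Python's `decimal` module set to seven significant digits (my own sketch, not part of the original answer):

```python
from decimal import Decimal, getcontext

getcontext().prec = 7    # seven-place base-10 arithmetic
x = Decimal("1e-4")
t1 = Decimal(1) + x      # 1.0001, exact
t2 = t1.sqrt()           # 1.000050, correct to within 1/2 LSB
d = t2 - Decimal(1)      # 0.000050: the five leading places canceled
print(t2, d)             # 1.000050 0.000050
```

The true value of $\sqrt{1.0001}-1$ is $4.99987500\ldots \cdot 10^{-5}$; the half-LSB error in $t_2$, harmless relative to $t_2$ itself, is enormous relative to the two surviving digits of $d$.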