How can you tell if a least squares/rootfinding problem is well conditioned only by calculating the roots of a polynomial fit

numerical linear algebranumerical methodspolynomials

Okay, so I'm working on a problem in which, after being given about 10 data values, we fit a polynomial of degree 3 to the data. After finding the roots of the polynomial (2 of which are imaginary, and the 3rd one is around $-2000$), can we tell if a problem is well conditioned or ill conditioned just by this?

I was reading the Trefethan & Bau book on numerical LA, and saw something about how "The determination of the roots of a polynomial, given the coefficients, is a classic example of an ill-conditioned problem," and "Polynomial rootfinding is typically ill-conditioned even in cases that do not involve multiple roots."

My confusion is, when I added small perturbations to my cubic polynomial's coefficients, I saw that the roots that I was getting were nearly identical? So much so that adding $0.0001$ to the $x^0$ coefficient gave me roots that were different only after the 5th decimal place. Even adding $1$ gave me roots that differed after the first decimal place, but were still nearly the same.

These roots seem pretty similar and don't seem to change a lot, so does that mean that it's a well conditioned problem? If not, how can I tell?

I feel like I might be missing something in order to understand this properly, but I'm honestly not sure what. I would really appreciate if anyone could help!

Best Answer

No, in general you cannot judge the conditioning of a problem by solving a single instance of the problem. You need to investigate if the output is sensitive to small changes in the input.

In your specific case the input consist of $n=10$ data values and the output is, say, the real root. You have a real function $f : \mathbb{R}^n \rightarrow \mathbb{R}$. The componentwise relative condition number $\kappa$ can in principle be calculated as $$\kappa = \frac{|Df(x)||x|}{|f(x)|},$$ where $Df(x) \in \mathbb{R}^{1 \times n}$ denotes the Jacobian of $f$ at the point $x \in \mathbb{R}^{n \times 1}$. I imagine that this is not entirely trivial to compute the Jacobian, but this is the way forward if we need certainty. Until then you could experiment a bit with small changes of your data points. If the root does not change a lot, then you have no reason to suggest that the root is ill-conditioned.

When you changed the coefficient of the polynomial you were conducting a somewhat different experiment. You were changing the intermediate results rather than your original input.

On a related note consider a smooth family of polynomials, say, $$p(t,x) = \sum_{j=0}^m a_j(t) x^j$$ and suppose that we have found a differentiable function $t \rightarrow \mathbb{R}$ such that $$p(t,x(t)) = 0.$$ It is natural to ask if $x$ is sensitive to small changes in $t$. By the chain rule we have $$0 = p_t(t,x(t)) + p_x(t,x(t))x'(t)$$ and if $p_x(t,x(t)) \not = 0$, then $$ x'(t) = -\frac{p_t(t,x(t))}{p_x(t,x(t))}. $$ It is clear that large value of $\frac{x'(t)}{x(t)}$ are quite possible especially in the vicinity of double root $x_0$ where $$p(t_0,x_0) = p_x(t_0,x_0) = 0.$$ By the same token, if $p_t$ is large, then the roots can be sensitive to small changes in $t$ even when they are well separated.


The example in your books plays of $$x^2 + 2x + 1 = (x+1)^2,$$ hence $x=-1$ is a double root. The roots of $x^2+2x+c = 0$ are $x = -1 \pm \sqrt{1-c}.$ Hence when $c = 1 - \delta$ we have $x = -1 \pm \sqrt{\delta}$. We observe that the perturbation of the root is $O(\sqrt{\delta})$ rather than $O(\delta)$. A well conditioned root would have a perturbation that was $O(\delta)$. Here the root is extremely sensitive to changes in the last coefficient.
Saying that a root is ill-conditioned refers to a specific root. Other roots of the same polynomial may be well-conditioned. Say that the root finding problem is ill-condition refers to the fact that there exists polynomials that have at least one ill-conditioned root.

In general the conditioning of a specific problem comes down to identifying a specific function, its domain and codomain and then investigating how sensitive the output is to small changes in the input. One can measure size in many different ways, but this is a detail that is not significant at this point. In your original problem, the polynomial is simply an means to an end. The output is the root, and the input consists of the data points that determine the polynomial. Those are the points that you should systematically perturb when you worry about the conditioning of a specific root.