Algebraic Geometry – Two-Variable Polynomials with Finite Roots

algebraic-geometrypolynomialsreal-algebraic-geometryroots

I found this interesting exercise (not homework assignment) in real algebraic geometry:

Describe an algorithm which decides whether a given polynomial in $\mathbb{Q}[X, Y]$ has infinitely or finitely many roots in $\mathbb{R}^2$.

There's not much I've found already. It's easy to construct a polynomial with a given number of roots, for example
$$
X^2 + \prod_{k=1}^n (Y – k)^2
$$
has exactly $n$ roots in $\mathbb{R}^2$. The idea used is that any system of polynomials in $\mathbb{Q}[X, Y]$ having finitely many solutions (in this case $\lbrace X = 0, \prod_{k=1}^n (Y – k) = 0 \rbrace$) can be converted into one polynomial in $\mathbb{Q}[X, Y]$ having exactly those solutions as roots using sums of squares. However, not all polynomials in $\mathbb{Q}[X, Y]$ can be written as a sum of squares (or minus a sum of squares), provided that they have finitely many roots. For example, it is well-known that the Motzkin polynomial $F = X^4Y^2 + X^2Y^4 – 3X^2Y^2 + 1$ cannot be written as a sum of squares in $\mathbb{R}[X, Y]$, but on the other hand the polynomial
$$
(1 + X^2)F = (1 – X^2Y^2)^2 + X^2(1 – Y^2)^2 + X^2Y^2(1 – X^2)^2
$$
has exactly the same roots in $\mathbb{R}^2$ as $F$, namely $(1, 1), (1, -1), (-1, 1)$ and $(-1, -1)$. The above equality also shows that $F$ only takes positive values.

In fact, using the middle value theorem, one can easily see that if a polynomial in $\mathbb{R}[X, Y]$ can take both strictly negative and strictly positive values, it must have infinitely many zeroes in $\mathbb{R}^2$. Conversely, though, some polynomials in $\mathbb{Q}[X, Y]$ never take negative values and still have infinitely many zeroes.

Any thoughts on this? Both hints and full answers are welcome.

Best Answer

As a model-theorist, here's how I would think about this problem. I'm sure there are more direct ways of solving it, but I'll describe a "logical" approach.

First, recall that the theory of real closed fields is decidable. This means that for any sentence $\varphi$ in the first-order language of ordered fields, there is an algorithm (albeit an incredibly inefficient algorithm) to determine whether $\varphi$ is true or false in $\mathbb{R}$. You can view this as a special case of effective quantifier elimination for the theory RCF of real closed fields: There is an algorithm which finds, for any first-order formula $\varphi(x_1,\dots,x_n)$, an equivalent quantifier-free formula $\psi(x_1,\dots,x_n)$. A sentence is just a formula with no free variables, and quantifier-free sentences are easily decidable.

OK, but the statement "there exist infinitely many $(x,y)$ such that $p(x,y) = 0$" isn't expressible as a first-order sentence - first-order logic doesn't have the quantifier "there exist infinitely many". So let's break down what this means.

Note that if $p$ has infinitely many roots, then either (1) there is some $a$ such that there are infinitely many $b$ with $p(a,b) = 0$, or (2) there are infinitely many $a$ such that there is some $b$ with $p(a,b) = 0$. So our condition is equivalent to $$(\exists x\,\exists^\infty y\,p(x,y) = 0)\lor (\exists^\infty x\,\exists y\,p(x,y)),$$ where $\exists^\infty$ stands for "there exist infinitely many".

Looking at case (1) first, note that for any fixed $x = a$, $p(a,y)$ is a polynomial in $y$, and $\exists^\infty y\,p(a,y) = 0$ if and only if $p(a,y)$ is the zero polynomial. Writing $p(x,y) = \sum_{i=0}^d q_i(x)y^d$, we see that $p(a,y) = 0$ if and only if $a$ is a common root of the polynomials $\{q_0(x),\dots,q_d(x)\}$. Thus, $\exists x\, \exists^\infty y\,p(x,y)$ is equivalent to $\exists x\, \bigwedge_{i=0}^d q_i(x) = 0$, and now we can apply the decision procedure for RCF to answer this question.

Turning to case (2), let $\varphi(x)$ be the formula $\exists y\, p(x,y)$. It is a consequence of quantifier elimination that any formula in one free variable defines a finite union of points and intervals (the is what it means to say that RCF is o-minimal), and a finite union of points and intervals is infinite if and only if it contains a nonempty open interval. Thus, $\exists^\infty x\, \varphi(x)$ is equivalent to $$\exists z_1\, \exists z_2\, ((z_1<z_2)\land (\forall w\, (z_1<w<z_2)\rightarrow \varphi(w))),$$ and now we can apply the decision procedure for RCF to answer this question.

Having given an algorithm to decide whether each of the cases is true, we're done.

You might want to dig deeper and see if you can replace the two applications of the decision algorithm black box with more concrete (and efficient!) algorithms. In part (1) we just asked whether a system of polynomials in one variable has a common root, which can be done using Gröbner bases. The application in part (2) was a bit more complicated, but it can be broken down as follows. First, we could apply quantifier elimination to $\varphi(x)$ (which was $\exists y\, p(x,y) = 0$). This amounts to coming up with a quantifier-free condition on $a$ which is equivalent to the polynomial $p(a,y)$ having a root, which can be done by working carefully through an application of Sturm's theorem to $p(a,y)$, and keeping track of how properties of $a$ affect the answer. Now letting $\psi(x)$ be the quantifier-free equivalent to $\varphi(x)$, we can put $\psi(x)$ in disjunctive normal form as $\bigvee_{k=1}^m \psi_k(x)$ and ask whether any of the $\psi_k(x)$ contain an open interval. Each of the $\psi_k$ is a conjunction of atomic and negated atomic conditions, which is equivalent to a system of polynomial inequalities, each of the form $q_i(x) \leq 0$ or $q_i(x) \geq 0$. You can test whether the solution set of this system contains a nonempty open interval by checking to see whether any of the polynomials involved share roots, and then approximating the roots sufficiently precisely.


Bonus note: Despite the fact that first-order logic doesn't have the quantifier $\exists^\infty$ ("there exist infinitely many"), one can still talk about "elimination of the quantifier $\exists^\infty$". A theory $T$ eliminates $\exists^\infty$ if for every formula $\varphi(x_1,\dots,x_n,y)$, there is a formula $\psi(x_1,\dots,x_n)$ such that for any tuple $\overline{a}$ from a model $M$ of $T$, $M\models \psi(a_1,\dots,a_n)$ if and only if there are infinitely many $b$ such that $M\models \varphi(a_1,\dots,a_n,b)$. This elimination is effective if there's an algorithm which produces $\psi$ from $\varphi$.

Now RCF eliminates $\exists^\infty$ (in fact, every o-minimal theory does, see the Finiteness Lemma 3.1.7 in Tame topology and o-minimal structures by van den Dries). And I'm confident that it does so effectively, which would answer the question directly. But I don't know a reference off the top of my head, and writing down the algorithm would involve similar reasoning to what I wrote above, but in much greater generality.

Related Question