[Math] About the proof of the proposition “there exists irrational numbers a, b such that a^b is rational”

constructive-mathematicslo.logicproof-theory

What does the classical proof of the proposition "there exists irrational numbers a, b such that $a^b$ is rational" want to reveal? I know it has something to do with the difference between classical and constructive mathematics but am not quite clear about it. Materials I found online does not give quite clear explanations either. Could someone give a better explanation?

Best Answer

Presumably, the proof you have in mind is to use $a=b=\sqrt2$ if $\sqrt2^{\sqrt2}$ is rational, and otherwise use $a=\sqrt2^{\sqrt2}$ and $b=\sqrt 2$. The non-constructivity here is that, unless you know some deeper number theory than just irrationality of $\sqrt 2$, you won't know which of the two cases in the proof actually occurs, so you won't be able to give $a$ explicitly, say by writing a decimal approximation.

Related Solutions

[Math] Difference between constructive Dedekind and Cauchy reals in computation

In line with what is actually asked in the bold part of the question, I have taken the liberty of changing the title from constructive mathematics to computation. As it stood it was essentially a duplicate of another one, which already contains a lot about constructive but non-computational mathematics and so that would be the appropriate forum for further such discussion.

In fact Valery tagged the answer to the end of his first paragraph, namely @AndrejBauer's Efficient Computation with Dedekind Reals, which describes the ideas behind his Marshall calculator, whilst being based on our Dedekind Reals in Abstract Stone Duality and my Lambda Calculus for Real Analysis.

The setting for this answer is my ASD programme. I refuse to carry the baggage of obsolete programming languages (Turing Machines and Gödel Codings), any form of set theory (CZF, which I think is the setting for Lubarsky and Rathjen's paper, or even a topos) or the two bolted together (computable analysis in Weihrauch's TTE).

Valery's definition of a Dedekind cut is correct, but (to underline my rejection of any form of set theory) I write $\delta d$ and $\upsilon u$ for the down and up predicates instead of his $q\in L$ and $q\in U$ for lower and upper sets.

The predicate $\delta d$ says that (we so far know that) $d\lt x$ where $x$ is the real number being defined and similarly $\upsilon u$ means that (we so far know that) $x\lt u$.

Where this knowledge is incomplete, the pair $(\delta,\upsilon)$ satisfies all but the last axiom (locatedness) and is equivalent to an interval.

In particular, Dedekind cuts are two-sided. Taking the complement would amount to deducing knowledge or termination from ignorance or divergence.

There are plenty of examples of limiting processes that define "uncomputable" numbers that are best thought of as one-sided cuts. For example, the domain of definition of the solution of a differential equation is something that is only known from the inside; it is remarkable if we can obtain some knowledge from the outside, not that the boundary is "uncomputable" in this sense.

In the case where the cut defines a rational number, it should occur on neither side (Dedekind's choice was wrong here).

Defining the usual stuff in elementary real analysis is much easier and more natural using Dedekind cuts than with Cauchy sequences. Consider Riemann integration: $\delta d$ holds if there is some polygon (piecewise linear function) that lies below the curve and has area more than $d$; deducing the various properties from this is easy. This is gratuitously difficult if you insist in advance that the polygon have a particular shape of resolution $2^{-n}$ to make a Cauchy sequence.

The essential difference between a Dedekind cut and a Cauchy sequence as the representation of the solution of a mathematical problem is that Cauchy assumes that the problem has already been solved (to each degree of precision), whereas Dedekind merely combines all the formulae ("compiles" them into an "intermediate code", if you like), so that the problem remains to be solved.

This means that deriving a Cauchy sequence or decimal expansion from a Dedekind cut in general is just as difficult as solving a general real-valued mathematical problem. So I am under no illusion about the difficulty in implementing this proposal.

But it is a novel and so valuable way of looking at things.

The predicates define open subspaces of $\mathbb R$, so I would think that, in terms of recursion, they have to be $\Sigma^0_1$ or recursively enumerable. (I have thought a little bit about "descriptive set theory" (a name that needs to be changed) but haven't got very far with it.)

These predicates $\delta$ and $\upsilon$ are in the language of ASD and you will need to read the papers that I have linked to find out the details.

In the first instance, such predicates are built from $f(\vec x)\lt g(\vec x)$, where $f$ and $g$ are polynomials, using $\land$, $\lor$, $\exists$, recusion over $\mathbb N$ and universal quantification over closed bounded intervals.

We are given that $\delta d$ and $\upsilon u$ hold for certain rationals $d$ and $u$ and want to find some $d\lt m\lt u$ that is nearer to the result and whether $\delta m$ or $\upsilon m$.

In the case of $f(\vec x)\lt g(\vec x)$ we can use the (formal, interval) derivatives of $f$ and $g$ and the Newton--Raphson algorithm to do this. As is well known, this algorithm doubles the number of bits of precision at each step. This would justify the claim that the computation is efficient if we can extend it to more general formulae.

For more complicated logical expressions, Andrej developed a method of approximating these expressions with simpler ones. See his paper for details of this. This is quite closely analogous to methods of approximating real-valued functions by low(er) degree polynomials that have a very long history but have been developed in interval computation by Michal Konecny. Whilst Michal himself has used these computationally, they have not yet been incorporated into Andrej's calculator, but I hope that this will be done in future.

Computations like this with so-called "exact real arithmetic" are necessarily non-deterministic with regard to the Cauchy sequences tat they produce, which means that the existence of the completed sequence necessarily relies on Dependent Choice. This is entirely natural from a computational point of view and I don't see what use there is in discussing it as an issue of constructivity.

[Math] Bishop quote stating that axiom of choice is constructively valid

According to the BHK interpretation of intuitionistic logic we have that:

A proof of $\exists x \in A . \phi(x)$ consists of a pair $(a, p)$ where $a \in A$ and $p$ is a proof of $\phi(a)$.
A proof of $\forall x \in A . \phi(x)$ is a method which takes as input $a \in A$ and outputs a proof of $\phi(a)$.
A proof of $\psi \Rightarrow \xi$ is a method which takes proofs of $\psi$ to proofs of $\xi$.

Here "method" should be understood as an unspecified, pre-mathematical notion. It could be algorithm, or continuous map, or mental process, or Turing machine, etc.

The axiom of choice can be stated, for any sets $A$, $B$ and relation $\rho$ on $A \times B$, as: \begin{equation} (\forall x \in A . \exists y \in B . \rho(x,y)) \Rightarrow (\exists f \in B^A . \forall x \in A . \rho(x, f(x)). \tag{AC} \end{equation} This is equivalent to the usual formulation (exercise, or ask if you do not see why). Let us unravel what it means to have a proof of the above principle. First, a proof of $$\forall x \in A . \exists y \in B . \rho(x,y) \tag{1}$$ is a method $C$ which takes as input $a \in A$ and outputs a pair $$C(a) = (C_1(a), C_2(a))$$ such that $C_1(a) \in B$ and $C_2(a)$ is a proof of $\rho(a, C_1(a)).$ Second, a proof of $$\exists f \in B^A . \forall x \in A . \rho(x, f(x)) \tag{2}$$ is a pair $(g, D)$ such that $g$ is a function from $A$ to $B$ and $D$ is a proof of $\forall x \in A . \rho(x, g(x))$.

Therefore a proof of (AC) above is a method $M$ which takes the method $C$ which proves (1) and outputs a pair $(f, D)$ which proves (2). Is there such an $M$? It looks like we can take $f = C_1$ and $D = C_2$, and viola the axiom of choice is proved constructively! Well, not quite. We were asked to provide a function $f : A \to B$ but we provided a method $C_1$. Is there a difference? That depends on the exact meaning of "method" and "function". There are several possibilities, see below.

The important thing is that now we can understand what Bishop meant by "a choice is implied by the very meaning of existence". If we ignore the difference between "method" and "function" then under the BHK interpretation choice holds because of the constructive meaning of $\exists$: to exist is to construct, and to construct a $y \in B$ depending on $x \in A$ is to give a method/function that constructs, and therefore chooses, for each $x \in A$ a particular $y \in B$.

It remains to consider whether a "method" $C_1$ taking inputs in $A$ and giving outputs in $B$ is the same thing as a function $f : A \to B$. The answer depends on the exact formal setup that we use to express the BHK interpretation:

Martin-Löf type theory

In Martin-Löf type theory there is no difference between "method" and "function", and therefore choice is valid there (but the exact argument outlined above).

Bishop constructive mathematics

In Bishop constructive mathematics a set is given by an explanation of how its elements are constructed, and when two such elements are equal. For instance, a real number is constructed as sequence of rational numbers satisfying the Cauchy condition, and two such sequences are considered equal when they coincide in the usual sense. This means, in particular, that two different constructions may represent the same element (both $n \mapsto 1/n$ and $n \mapsto 2^{-n}$ represent the real number "zero").

Now, importantly, we distinguish between operations and functions. The former is a mapping from a set $A$ to a set $B$, and the latter a mapping which respects equality (we say that it is extensional). To see the difference, consider the operation from $\mathbb{R}$ to $\mathbb{Q}$ which computes from a given $x \in \mathbb{R}$ a rational $q \in \mathbb{Q}$ such that $x < q$: since $x$ is a Cauchy sequence, we may take $q = x_i + 42$ for a large enough $i$ (which can be determined explicitly once we make our definition of reals a bit more specific). The operation $x \mapsto q$ does not respect equality: by taking a different Cauchy sequence $x'$ which represents the same real, we get a rational upper bound $q'$ which is not equal to $q$. In fact, in Bishop constructive mathematics it is impossible to construct an extensional operation that computes rational upper bounds of reals.

In Bishop constructive mathematics method is understood as operation, and function as extensional operation. Choice is then valid only in some instances, but not in general. In particular, if $A$ has the property that every element is canonically represented by a single construction, then every operation from $A$ to $B$ is automatically extensional, and choice from $A$ to $B$ is valid. An example is $A = \mathbb{N}$ because each natural number is represented by precisely one construction: $0$, $S(0)$, $S(S(0))$, ...

The moral of the story is that the devil hides important details in the passage from informal, pre-mathematical notions to their mathematically precise formulation.