Nonstandard Analysis – Nonstandard Infinite/Hyperfinite Sum in IST

first-order-logic, logic, nonstandard-analysis, real-analysis, sequences-and-series

TLDR:
If anyone could provide a detailed proof that a sum indexed by an unlimited hypernatural number is well-defined using the axioms of IST, I would greatly appreciate it.

I am studying Nelson's "Radically Elementary Probability Theory" and I am having some trouble. Namely, after introducing the axioms of IST, Nelson proves Theorem 5.3 where the expression
$$
\sum_{i=1}^n x_i
$$

arises (in the statement of the theorem), and he remarks afterwards that "there is no requirement that $n$ be limited". However, this is quite confusing to me, as so far he hasn't defined what an unlimited sum would mean. I have seen posts like this one:
How are infinite sums in nonstandard analysis defined?
that talk about this same issue, but these posts do not specify which framework of NSA they are working in, and, from the references to Keisler's book in the comments, I think they are working in a different framework than Nelson's.

With that said, the above post does mention the transfer principle, which I know is also an axiom of IST. What troubles me, however, is that the specifics of the justification are glossed over, and all of the interpretations I can come up with are disturbing. These are some interpretations I came up with (I will use $\mathbb{N}$ to denote the standard naturals and $\mathbb{N}^*$ to denote the hypernaturals; though $\mathbb{N}$ is not a set, this is ok because in the nonstandard setting we will only be using the notation $n \in \mathbb{N}$ as a shorthand for "$n$ is standard"):

  1. We define a function $S: \mathbb{N} \to \mathbb{R}^*$ that is the partial sum function with respect to a previously fixed hyperreal sequence $(a_n)_{n \in \mathbb{N}^*}$. Then $S$ is defined for every $n \in \mathbb{N}$, and (I'm not sure if this is correct but I'm guessing so) extends to some function defined on $\mathbb{N}^*$.

  2. You begin in the classical universe and fix a sequence $(a_n)_{n \in \mathbb{N}}$. Define its corresponding partial sum function $S$. Then $(a_n)_{n \in \mathbb{N}}$ admits a (unique?) extension to the sequence $(a_n)_{n \in \mathbb{N}^*}$ after applying transfer. Then $S$ also admits a corresponding extension $S^*$.

  3. You begin in the classical universe and define a function $S: \mathbb{N} \times \{\mathrm{all} \ \mathrm{real} \ \mathrm{sequences} \} \to \mathbb{R}$ and then take its transfer.

The reasons I find these disturbing/problematic are as follows:

  1. You need to fix a sequence indexed by the hypernaturals in order to define a function from the classical naturals to the hyperreals. There is a mixing of universes all over the place here and that seems very wrong.
  2. One sequence indexed by the naturals (e.g., $(0)_{n \in \mathbb{N}}$) can easily be extended to different sequences indexed by the hypernaturals (e.g., $(0)_{n \in \mathbb{N}^*}$ and $(\delta_{n, \nu})_{n \in \mathbb{N}^*}$ where $\delta_{n, \nu}$ is the Kronecker $\delta$ with $\nu$ an unlimited hypernatural). This would mean the extension of $S$ cannot possibly be unique, and we have to choose an extension that fits the sequence whose partial sums we are trying to show exist.
  3. You are defining a function whose domain is a set of sets, and as far as I vaguely understand, this is not allowed in "first-order logic" (I don't know what this means besides that it is some restriction on what quantifiers you can use), and the transfer principle only works (or should only work?) for first-order statements.

P.S.
I am studying this book because I am interested in studying Herzberg's book on stochastic calculus (which also glosses over what this sum would mean). I am not so much interested in logic/model theory, but the existence of this sum is crucial to the ability to define probability measures pointwise on infinitesimal meshes, and so I would really like to know why we can do this.

Best Answer

Welcome to Math.SE!

You did not indicate how much background you have in Internal Set Theory (IST), but based on what you wrote, it seems like you have a few misconceptions about it. Let us start by clearing these up!

When we work in Internal Set Theory, we do not need to consider any $~^\star$-extensions of the structure $\mathbb{R}$, so we don't normally concern ourselves with constructing a set of hyperreals $~^\star\mathbb{R}$, nor hyperextensions for the naturals. We could construct such extensions: Internal Set Theory allows us to do everything that we could ordinarily do in regular old (ZFC) set theory. But we don't need to, because IST provides us with infinitesimals and unlimited numbers "out of the box"!

Internal Set Theory adjoins a new predicate $\mathrm{st}(-)$ to the language of set theory itself, along with a bunch of new axioms (Idealization, Standardization, Transfer) that govern the behavior of the predicate $\mathrm{st}(-)$.

First, you have to understand that while the language of set theory is a first-order language, it of course allows you to define functions with a set of sets as a domain. In fact, every rigorously formulated statement that a mathematician encounters over the course of an ordinary career can be formalized as a first-order statement in the language of set theory. Wikipedia has an article on how mathematical structures are implemented in terms of set theory, if you'd like to learn more about this. In any case, the short version is that first-order logic allows you to quantify over all elements of your domain of discourse: second-order logic allows you to quantify over all subclasses of your domain of discourse. This distinction is quite important when you study the semantics of logic, but not all that important when you just want to work inside the first-order theory of sets (internal or ordinary): the domain of discourse is the "class of all sets" anyway, so if you quantify over elements of your domain, you in fact quantify over all sets. The Transfer principle (which I'll briefly describe below) works only for first-order statements in the language of set theory, but since that includes pretty much any statement, including statements about functions between different sets, you're good to go. There are a lot of subtleties here, but you won't need them unless and until you become more interested in logic and model theory.

Second, you have to understand that the new predicate $\mathrm{st}(-)$ added by IST can be applied to any set-theoretic object whatsoever: it's valid to ask whether the natural number $5$ is standard (it is, $\mathrm{st}(5)$ holds), or whether the set of even numbers $2\mathbb{N}$ is standard (it is, $\mathrm{st}(2\mathbb{N})$ holds), or even whether very complicated objects, like "the set of all possible topological spaces whose underlying set is the power set $\mathcal{P}(\mathbb{R})$ of the real numbers" is standard (it is).

The Transfer axiom/principle of IST essentially states that a property which can be formulated in the language of set theory (i.e. without referring to the new predicate $\mathrm{st}$, and without referring to objects that are defined in terms of this predicate $\mathrm{st}$) holds for every $x$ precisely if it holds for every standard $x$.
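For reference, here is one common way to write Transfer as a formula. This is a sketch in Nelson's notation, where $\forall^{\mathrm{st}} x$ abbreviates "for all standard $x$"; the names $A$ and $t_1, \dots, t_k$ are placeholders chosen for illustration. For every formula $A(x, t_1, \dots, t_k)$ that does not mention $\mathrm{st}$ and has no other free variables:

$$
\forall^{\mathrm{st}} t_1 \cdots \forall^{\mathrm{st}} t_k \, \Big( \forall^{\mathrm{st}} x \, A(x, t_1, \dots, t_k) \implies \forall x \, A(x, t_1, \dots, t_k) \Big).
$$

The converse implication is trivial, which gives the "precisely if" above. Note that the parameters $t_1, \dots, t_k$ must themselves be standard for Transfer to apply.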

Naively, you might think that all elements of the set of natural numbers $\mathbb{N}$ should satisfy the standardness predicate, i.e. that you would have $\forall n \in \mathbb{N}. \mathrm{st}(n)$. The Idealization axiom of Internal Set Theory guarantees that this is false! That is, in IST the set of natural numbers does have nonstandard elements. These nonstandard elements are larger than all the standard elements $0,1,2,\dots$, so we can think of them as unlimited ("practically infinite") natural numbers. But, as far as IST is concerned, such unlimited numbers are perfectly good natural numbers, and we can use them in any situation where we would use any other natural number.
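To make this concrete, here is a sketch of how Idealization produces such a number. The relation $B$ below is just one convenient choice, not the only one. Idealization says that for any formula $B(x,y)$ that does not mention $\mathrm{st}$,

$$
\Big( \forall^{\mathrm{st\,fin}} F \; \exists y \; \forall x \in F \; B(x,y) \Big) \iff \Big( \exists y \; \forall^{\mathrm{st}} x \; B(x,y) \Big),
$$

where $\forall^{\mathrm{st\,fin}} F$ ranges over standard finite sets $F$. Take

$$
B(x,y) \;:\equiv\; y \in \mathbb{N} \,\wedge\, (x \in \mathbb{N} \implies x < y).
$$

The left-hand side holds: given any finite set $F$, the number $y = 1 + \max(F \cap \mathbb{N})$ (with $y = 0$ if $F \cap \mathbb{N}$ is empty) works. Hence the right-hand side holds too: there is some $y \in \mathbb{N}$ with $x < y$ for every standard $x \in \mathbb{N}$. Such a $y$ cannot be standard, since otherwise we would get $y < y$.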

This is the reason we don't bother with the hypernaturals $~^\star\mathbb{N}$ and other $~^\star$-extensions when we work using Internal Set Theory: the axioms give us access to nonstandard elements right inside the usual set of natural numbers $\mathbb{N}$, and these behave analogously to how elements of $~^\star\mathbb{N} \setminus \mathbb{N}$ behave in Robinson/Keisler-style nonstandard analysis.

Chapters 1 and 2 of the book Nonstandard Analysis by Alain Robert gently introduce the axiomatics of Internal Set Theory, explain how nonstandard elements behave, and show how you can construct unlimited nonstandard integers using the Idealization axiom of IST.

In the REPT book, Nelson works in a much smaller fragment of IST. For example, he doesn't use Idealization to construct a nonstandard natural number: he gets one for free using Assumption 3 of Chapter 4. Similarly, he presents the so-called external induction principle simply as Assumption 4 of Chapter 4: normally, you'd get external induction as a consequence of the IST axioms.

Once you grasp the basic idea behind IST, you should be able to read Chapter 4 of REPT directly. There are many other resources as well: e.g. Hrbacek and Katz present a fragment/variant of Internal Set Theory which is sufficient to carry out all mathematical developments in REPT, and external induction is proved as Lemma 3.3 there. In principle, I also prove everything you might need in the introduction of my PhD thesis, but I would not recommend using that as a primary resource, since I both assume familiarity with proof theory and model theory, and present the material in extreme generality.


Now we can finally discuss how unlimited sums are actually defined, based on the axioms of Internal Set Theory. It's going to be a detailed proof, as you requested, although there really isn't much detail:

Of course, given the set of real-valued sequences $S := \mathbb{N} \rightarrow \mathbb{R}$, we can define a partial sum function $\Sigma : S \times \mathbb{N} \rightarrow \mathbb{R}$, which takes two arguments, a sequence $a \in S$ and a number $n \in \mathbb{N}$, and yields the sum $$\Sigma(a,n) = a_0 + a_1 + \dots + a_{n-1} + a_n.$$

Defining this function $\Sigma$ does not require any of the axioms of Internal Set Theory: we can define it (say, by induction) in the exact same way we would define it in ordinary mathematics. What you get is a perfectly ordinary function $\Sigma$, which you can then apply to any sequence $a : \mathbb{N} \rightarrow \mathbb{R}$ and any element $n \in \mathbb{N}$ to get a partial sum $\Sigma(a,n) \in \mathbb{R}$.

edit: One worry you might have is whether inductive definitions work as usual in IST. Yes: every theorem of ordinary set theory is also a theorem of IST, and since ordinary set theory proves that there exists a unique function $\Sigma(a,n)$ that satisfies $\Sigma(a,0) = a_0$ and $\Sigma(a,n+1) = \Sigma(a,n) + a_{n+1}$ for all natural numbers $n$, so does IST.

As I explained above, thanks to the Idealization axiom of Internal Set Theory (or Nelson's Assumption 3 of REPT Chapter 4) we know that there exists some nonstandard element $\omega \in \mathbb{N}$ (Chapter 3 of Robert's NSA). That's the only result of Internal Set Theory that we need.

While $\omega$ is not standard, it is nonetheless a natural number inside IST. Since we can apply our $\Sigma$ function to any sequence $a$ and any natural number $n$, we can apply it to $n = \omega$ as well, and get a (usually nonstandard) real number $\Sigma(a,\omega)$ as a result. That's it! There's no need to consider any extensions of the $\Sigma$ function to any larger set: the function is defined for each natural number, and every nonstandard natural happens to be a natural number, so $\Sigma$ itself is already defined there. As Nelson says, there is no requirement that $n$ be limited!

Now, $\Sigma(a, \omega)$ is usually going to be a nonstandard real number. However, if you have an $a$ for which you know standard lower and upper bounds on $\Sigma(a, \omega)$, then you can show (and Robert proves this in Chapter 3) that there is a unique standard real number $r$ so that the difference $|\Sigma(a, \omega) - r|$ is less than any standard positive real number: i.e. $\Sigma(a, \omega)$ is infinitesimally close to some standard $r$.

For example, if your sequence $a$ is the geometric sequence $a_n = 2^{-n}$, then you know that $0 < \Sigma(a, \omega) < 3$, so you can deduce the existence of a standard real number $r$ infinitesimally close to the "infinite" sum $\Sigma(a, \omega)$: in this case, $r = 2$. This is the property that makes a whole lot of infinitesimal analysis work, but that's a story for another time...
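To spell out the geometric example, here is a short worked sketch, using the convention $\Sigma(a,0) = a_0$ and $\Sigma(a,n+1) = \Sigma(a,n) + a_{n+1}$. Ordinary induction proves the usual closed form for every natural number $n$, and since $\omega$ is a natural number, it holds for $\omega$ as well:

$$
\Sigma(a,n) = \sum_{k=0}^{n} 2^{-k} = 2 - 2^{-n} \quad \text{for all } n \in \mathbb{N}, \qquad \text{so} \qquad \Sigma(a,\omega) = 2 - 2^{-\omega}.
$$

It remains to see that $2^{-\omega}$ is infinitesimal. For each standard $\varepsilon > 0$ there is (by Transfer) a standard $N \in \mathbb{N}$ with $2^{-N} < \varepsilon$, and $\omega > N$ because $\omega$ exceeds every standard natural number. Therefore

$$
0 < |\Sigma(a,\omega) - 2| = 2^{-\omega} < 2^{-N} < \varepsilon
$$

for every standard $\varepsilon > 0$, which is exactly the statement that $\Sigma(a,\omega)$ is infinitesimally close to $r = 2$.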
