Is there a first-order statement about the natural numbers (not nonstandard analysis) such that the truth of the statement is easier to see in a nonstandard model? In other words, do nonstandard models offer "more" mathematical insight in some way about the standard model? Compare this with the use of the compactness theorem to prove certain facts about infinite objects that follow in a straightforward manner from the corresponding facts about finite objects, whereas a direct "mathematical" argument tends to be more tedious (e.g., 4-colorability of an infinite planar graph).
[Math] nonstandard models and mathematical theorems
mathematical-philosophy, model-theory, peano-arithmetic, reference-request
Related Solutions
There are indeed many proofs of the Compactness theorem. Leo Harrington once told me that he used a different method of proof every time he taught the introductory graduate logic course at UC Berkeley. There is, of course, the proof via the Completeness Theorem, as well as proofs using ultrapowers, reduced products, Boolean-valued models and so on. (In my day, he used Boolean valued models, but that was some time ago, and I'm not sure if he was able to keep this up since then!)
Most model theorists today appear to regard the Compactness theorem as the significant theorem, since the focus is on the models---on what is true---rather than on what is provable in some syntactic system. (Proof theorists, in contrast, may focus on the Completeness theorem.) So it is not because Completeness is too hard that Marker omits it, but rather just that Compactness is the important fact. Surely it is the Compactness theorem that has deep applications (or at least pervasive applications) in model theory. I don't think formal deductions appear in Marker's book at all.
But let's get to your question. Since the exact statement of the Completeness theorem depends on which syntactic proof system you set up---and there is a huge variety of such systems---any proof of the Completeness theorem will have to depend on those details. For example, you must specify which logical axioms are formally allowed, which deduction rules, and so on. The truth of the Completeness Theorem depends very much on the details of how you set up your proof system, since if you omit an important rule or axiom, then your formal system will not be complete. But the Compactness theorem has nothing to do with these formal details. Thus, there cannot be a hands-off proof of Completeness from Compactness that does not engage with the details of the formal syntactic proof system. Any proof must establish some formal properties of the formal system, and once you are doing this, the Henkin proof is not difficult (surely it fits on one or two pages). When I prove Completeness in my logic courses, I often remark to my students that the fact of the theorem is a foregone conclusion, because at any step of the proof, if we need our formal system to be able to make a certain kind of deduction or have a certain axiom, then we will simply add it if it isn't there already, in order to make the proof go through.
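For readers who haven't seen it, here is the shape of the Henkin argument (a standard outline, not tied to any particular textbook's proof system): given a consistent theory T in a language L,

1. add new constant symbols and Henkin axioms $\exists x\,\varphi(x)\rightarrow\varphi(c_\varphi)$, one for each formula $\varphi(x)$ of the expanded language, checking that consistency is preserved;
2. extend the result to a maximal consistent theory $T^*$ (Lindenbaum's lemma);
3. form the term model whose elements are the closed terms modulo provable equality in $T^*$, interpreting each symbol in the obvious way;
4. prove by induction on formulas that the term model satisfies exactly the sentences of $T^*$, so in particular it is a model of T.

The places where the specific axioms and rules of the chosen proof system matter are the consistency-preservation checks in steps 1 and 2, which is the sense in which no proof can be entirely hands-off about the formal system.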
Nevertheless, Compactness can be viewed as an abstract Completeness theorem. Namely, Compactness is precisely the assertion that if a theory is not satisfiable, then it is because of a finite obstacle in the theory---a finite subtheory that is not satisfiable. If we were to regard these finite obstacles as abstract formal "proofs of contradiction", then it would be true that if a theory has no proofs of contradiction, then it is satisfiable.
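In symbols (just the contrapositive of the usual statement, written out):
$$T\ \text{is unsatisfiable}\quad\Longrightarrow\quad\text{some finite } T_0\subseteq T\ \text{is unsatisfiable},$$
so if we declare the finite unsatisfiable subtheories $T_0$ to be the abstract "proofs of contradiction" from T, then Compactness says exactly: a theory with no proof of contradiction has a model.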
The difference between this abstract understanding and the actual Completeness theorem is that all the usual deduction systems are highly effective, in the sense of being computable. That is, we can computably enumerate all the finite inconsistent theories by searching for formal syntactic proofs of contradiction. This is the new part of Completeness that the abstract version from Compactness does not provide. But it is important, for example, in the subject of Computable Model Theory, where one proves computable analogues of the Completeness Theorem. For example, any consistent decidable theory (in a computable language) has a decidable model, since the usual Henkin proof of Completeness is effective when the theory is decidable.
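A quick gloss (mine) on where decidability enters: in the Lindenbaum step of the Henkin construction one enumerates the sentences $\sigma_0,\sigma_1,\dots$ of the expanded language and adds $\sigma_n$ or $\lnot\sigma_n$ at stage $n$, and the only questions one must answer to do this are of the form "is $\sigma_n$ consistent with T together with the finitely many sentences chosen so far (and the Henkin axioms)?" When T is decidable these questions are decidable, so the completed theory $T^*$ is decidable, and the term model built from $T^*$ is then a decidable model.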
Edit: I found in Arnold Miller's lecture notes an entertaining account of an easy proof of (a fake version of) Completeness from Compactness (see page 58). His system amounts to the abstract formal system I describe above. Namely, he introduces the MM proof system (for Mickey Mouse), where the axioms are all logical validities, and the only rule of inference is Modus Ponens. In this system, one can prove Completeness from Compactness easily as follows: We want to show that T proves φ if and only if every model of T is a model of φ. The forward direction is Soundness, which is easy. Conversely, suppose that every model of T is a model of φ. Thus, T+¬φ has no models. By Compactness, there are finitely many axioms φ0, ..., φn in T such that there is no model of them plus ¬φ. Thus, (φ0∧...∧φn implies φ) is a logical validity. And from this, one can easily make a proof of φ from T in MM. QED!
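To spell out that last step (one convenient bookkeeping choice, not the only one): since $(\varphi_0\wedge\dots\wedge\varphi_n)\to\varphi$ is a validity, so is its exported form
$$\varphi_0\to\bigl(\varphi_1\to(\cdots\to(\varphi_n\to\varphi)\cdots)\bigr),$$
which is therefore one of the MM axioms; applying Modus Ponens $n+1$ times, using the hypotheses $\varphi_0,\dots,\varphi_n\in T$, yields $\varphi$, so T proves φ in MM.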
But of course, it is a joke proof system, since the collection of validities is not computable, and Miller uses this example to illustrate the point as follows:
The poor MM system went to the Wizard of OZ and said, “I want to be more like all the other proof systems.” And the Wizard replied, “You’ve got just about everything any other proof system has and more. The completeness theorem is easy to prove in your system. You have very few logical rules and logical axioms. You lack only one thing. It is too hard for mere mortals to gaze at a proof in your system and tell whether it really is a proof. The difficulty comes from taking all logical validities as your logical axioms.” The Wizard went on to give MM a subset Val of logical validities that is recursive and has the property that every logical validity can be proved using only Modus Ponens from Val.
And he then goes on to describe how one might construct Val, and give what amounts to a traditional proof of Completeness.
Under a not unreasonable assumption about cardinal arithmetic, namely $2^{<c}=c$ (which follows from the continuum hypothesis, or Martin's Axiom, or the cardinal characteristic equation t=c), the number of non-isomorphic possibilities for *R of cardinality c is exactly 2^c.

To see this, the first step is to deduce, from $2^{<c} = c$, that there is a family X of 2^c functions from R to R such that any two of them agree at strictly fewer than c places. (Proof: Consider the complete binary tree of height (the initial ordinal of cardinality) c. By assumption, it has only c nodes, so label the nodes by real numbers in a one-to-one fashion. Then each of the 2^c paths through the tree determines a function $f:c \to R$, and any two of these functions agree only at those ordinals $\alpha\in c$ below the level where the associated paths branch apart. Compose with your favorite bijection $R\to c$ and you get the claimed maps $g:R \to R$.)

Now consider any non-standard model *R of R (where, as in the question, R is viewed as a structure with all possible functions and predicates) of cardinality c, and consider any element z in *R. If we apply to z all the functions *g for g in X, we get what appear to be 2^c elements of *R. But *R was assumed to have cardinality only c, so lots of these elements must coincide. That is, we have some (in fact many) g and g' in X such that *g(z) = *g'(z). We arranged X so that, in R, g and g' agree only on a set A of size $<c$, and now we have (by elementarity) that z is in *A.

It follows that the 1-type realized by z, i.e., the set of all subsets B of R such that z is in *B, is completely determined by the following information: A and the collection of subsets B of A such that z is in *B. The number of possibilities for A is $c^{<c} = 2^{<c} = c$ by our cardinal arithmetic assumption, and for each A there are only c possibilities for B and therefore only 2^c possibilities for the type of z. The same goes for the n-types realized by n-tuples of elements of *R; there are only 2^c n-types for any finite n. (Proof for n-types: Either repeat the preceding argument for n-tuples, or use that the structures have pairing functions so you can reduce n-types to 1-types.)

Finally, since any *R of size c is isomorphic to one with universe c, its isomorphism type is determined if we know, for each finite tuple (of which there are c), the type that it realizes (of which there are 2^c), so the number of non-isomorphic models is at most (2^c)^c = 2^c.
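To tally the bounds in one line (my summary of the count just made):
$$\#\{\text{1-types realized in } {}^*R\}\ \le\ \underbrace{c}_{\text{choices of }A}\cdot \underbrace{2^{c}}_{\text{choices of }\{B\subseteq A\,:\,z\in{}^*B\}}\ =\ 2^{c},\qquad \#\{{}^*R\text{ up to isomorphism}\}\ \le\ (2^{c})^{c}\ =\ 2^{c}.$$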
To get from "at most" to "exactly" it suffices to observe (1) that every non-principal ultrafilter U on the set N of natural numbers produces a *R of the desired sort as an ultrapower, (2) that two such ultrapowers are isomorphic if and only if the ultrafilters producing them are isomorphic (via a permutation of N), and (3) that there are 2^c non-isomorphic ultrafilters on N.
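For step (3), the count can be spelled out (a standard argument, added here for completeness): by Pospíšil's theorem there are $2^{2^{\aleph_0}} = 2^c$ ultrafilters on N in total, while each isomorphism class is an orbit under the c-many permutations of N and so has at most c members; hence there must be $2^c$ classes.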
If we drop the assumption that $2^{<c}=c$, then I don't have a complete answer, but here's some partial information. Let $\kappa$ be the first cardinal with $2^\kappa > c$; so we're now considering the situation where $\kappa < c$. For each element z of any *R as above, let m(z) be the smallest cardinal of any set A of reals with z in *A. The argument above generalizes to show that m(z) is never $\kappa$ and that if m(z) is always $< \kappa$ then we get the same number 2^c of possibilities for *R as above. The difficulty is that m(z) might now be strictly larger than $\kappa$. In this case, the 1-type realized by z would amount to an ultrafilter U on $m(z) > \kappa$ such that its image, under any map $m(z) \to \kappa$, concentrates on a set of size $< \kappa$. Furthermore, U could not be regular (i.e., $(\omega,m(z))$-regular in the sense defined by Keisler long ago). It is (I believe) known that either of these properties of U implies the existence of inner models with large cardinals (but I don't remember how large). If all this is right, then it would not be possible to prove the consistency, relative to only ZFC, of the existence of more than 2^c non-isomorphic *R's.
Finally, Joel asked about a structure theory for such *R's. Quite generally, without constraining the cardinality of *R to be only c, one can describe such models as direct limits of ultrapowers of R with respect to ultrafilters on R. The embeddings involved in such a direct system are the elementary embeddings given by Rudin-Keisler order relations between the ultrafilters. (For the large cardinal folks here: This is just like what happens in the "ultrapowers" with respect to extenders, except that here we don't have any well-foundedness.) And this last paragraph has nothing particularly to do with R; the analog holds for elementary extensions of any structure of the form (S, all predicates and functions on S) for any set S.
Related Questions
- [Math] How are the two natural ways to define “the category of models of a first-order theory $T$” related
- [Math] Proving Independence of Axioms by Exhibiting Models Which Don’t Satisfy Our Intuition
- [Math] Illustrating Edward Nelson’s Worldview with Nonstandard Models of Arithmetic
- [Math] Modeling in pure math
Best Answer
I view nonstandard models (particularly those produced by ultraproducts) as a convenient way to take limits of discrete structures to obtain continuous structures that capture all of the asymptotic first-order properties of the discrete structure. This of course can also be done by the compactness theorem, but ultraproducts also offer good saturation properties, which tend to make the continuous structure "complete" or "locally compact" in various senses. This allows one to usefully deploy various tools from continuous mathematics (e.g. measure theory, ergodic theory, topological group theory) in a manner which would be inconvenient to do back in the discrete setting, or even in a continuous model that was constructed purely through a compactness argument. I give some examples of this at https://terrytao.wordpress.com/2013/12/07/ultraproducts-as-a-bridge-between-discrete-and-continuous-analysis/ . Some more sophisticated examples include the proof of the inverse conjecture for the Gowers norms, or the inverse theorem for approximate groups; at present the only known proofs of these results proceed via nonstandard methods (though I am sure that if one really wanted to, one could unpack the nonstandard arguments and obtain a fully standard, but very messy, proof of these results).
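For readers unfamiliar with the mechanism, the precise sense in which an ultraproduct captures "all of the asymptotic first-order properties" is Łoś's theorem (the standard statement, quoted here for convenience): for structures $M_n$, an ultrafilter $U$ on $\mathbb N$, and a first-order formula $\varphi$,
$$\prod_n M_n \,/\, U \ \models\ \varphi\bigl([x_n]\bigr)\quad\Longleftrightarrow\quad \{\,n\in\mathbb N : M_n\models\varphi(x_n)\,\}\in U,$$
where $[x_n]$ denotes the element of the ultraproduct represented by the sequence $(x_n)_n$ (and similarly for tuples of parameters). In particular, a first-order statement holds in the ultraproduct exactly when it holds for "$U$-most" of the discrete structures $M_n$.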
Model theorists have become quite good at exploiting additional properties of (suitably constructed) nonstandard models, such as the existence of a very large group of automorphisms, which (when combined with sufficiently strong saturation properties) then leads to the ability to describe "generic" elements, "indiscernible" sequences of elements, representatives of various "types", and similar useful gadgets. I'm not so familiar with this aspect of nonstandard methods though.