Mathematical Terminology – Lemma vs. Theorem

terminology

I've been using Spivak's book for a while now and I'd like to know what the formal difference is between a Theorem and a Lemma in mathematics, since he uses both names in his book. I'd like to know a little about the etymology, but mainly about why we choose Lemma for some findings and Theorem for others (not personally, but mathematically, i.e. why should one classify a finding as a lemma and not as a theorem). It seems that Lemmas are rather minor findings that serve as a keystone to proving a Theorem, but that is as far as I can go.

NOTE: This question doesn't address my concern, so please avoid citing it as a duplicate.

Best Answer

First off, there is no "formal difference" between a theorem and a lemma. Formally, if you view mathematics from the perspective of set theory (ZFC), you must conclude that anything commonly called a "lemma" in the literature is by definition "a theorem of ZFC," i.e. a statement that has a formal proof: a finite sequence of formulas, each of which is an axiom of ZFC or follows from earlier formulas by a rule of inference, ending on the formula representing the statement.
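A proof assistant makes this point concrete. The following is a minimal Lean 4 sketch, not anything taken from Spivak or from the literature; the names `helper_zero_add` and `main_result` are hypothetical and chosen only for illustration. Both declarations use the same `theorem` keyword, and the only thing that makes the first one a "lemma" is that the second one uses it. (As far as I know, Mathlib also offers a `lemma` keyword, which behaves as a synonym for `theorem`.)

```lean
-- Plays the role of a "lemma": a small helper fact about natural numbers.
-- `Nat.zero_add` is the core-library proof of this statement.
theorem helper_zero_add (n : Nat) : 0 + n = n :=
  Nat.zero_add n

-- Plays the role of the "theorem": it is proved by citing the helper.
-- Formally it is the same kind of object as `helper_zero_add`.
theorem main_result (n : Nat) : 0 + n = n + 0 :=
  (helper_zero_add n).trans (Nat.add_zero n).symm
```

Renaming either declaration, or swapping which one is called the lemma, changes nothing about what has been proved.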

So, lemmas are named with some literary freedom, on the understanding that they really are theorems, just somehow "little ones". But why bother?

A lemma typically comes in one of two forms: (i) a useful trick or (ii) a technical step in a proof. Let me give some examples.

A useful trick in real analysis is called "Fatou's Lemma," which helps us interchange limit operations and integrals. Very roughly, it states that

if the $f_n$ are non-negative measurable functions and $\displaystyle\lim_{n \rightarrow \infty} f_n(x) = f(x)$ for all $x$, then

$$\int \lim_{n \rightarrow \infty} f_n(x) \, dx = \int f(x) \, dx \leq \liminf_{n \rightarrow \infty} \int f_n(x) \, dx ,$$

which, it turns out, becomes "half of the work" in proving a lot of very useful and frequently used results like the Monotone Convergence Theorem and Lebesgue's Dominated Convergence Theorem. On its own, Fatou's Lemma is not so remarkable, and it quickly becomes a minor routine step in very major and fundamental theorems in real analysis; this is why it is itself called a lemma, not a theorem.
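To see that the inequality can be strict, here is a standard example (my own addition, assuming Lebesgue measure on $[0,1]$): take $f_n = n \cdot \mathbf{1}_{(0,1/n)}$. For every fixed $x$ in $(0,1]$ we have $f_n(x) = 0$ once $n \geq 1/x$, and $f_n(0) = 0$ for all $n$, so $f_n \to 0$ pointwise, yet each $f_n$ has integral $1$:

$$\int_0^1 \lim_{n \to \infty} f_n(x) \, dx = 0 < 1 = \lim_{n \to \infty} \int_0^1 f_n(x) \, dx .$$

So limits and integrals cannot be swapped freely, and Fatou's Lemma records the one inequality that always survives for non-negative functions.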

Another good example of a lemma of type (i) is "Zorn's lemma". Zorn's lemma is a technical statement about partially ordered sets, but it is invoked frequently in proofs studying ideals in ring theory (I'm sure it has many more uses but I'm unfamiliar with them).

The strange thing about Zorn's lemma is that it is logically equivalent to the Axiom of Choice, i.e. from Zorn's lemma you can prove the Axiom of Choice and from the Axiom of Choice you can prove Zorn's lemma. In other words, if you studied the axioms of set theory but instead of assuming the axiom of choice you assumed Zorn's Lemma as an axiom (let's call this Zorn's Axiom for now), then you could eventually deduce the Axiom of Choice (perhaps Lemma of Choice?) as a consequence of Zorn's Axiom. So Zorn's lemma is a lemma ONLY BECAUSE we assume the Axiom of Choice rather than Zorn's lemma as an axiom of standard set theory: it is a lemma only because of how we choose to organize mathematics.

A type (ii) lemma is something highly technical that, if proven in the middle of the proof of the theorem you are really after, takes so long that you may have difficulty getting back on track. This happens ALL THE TIME in mathematics. Here is an example I came across recently from the proof of Dirichlet's theorem on arithmetic progressions in Tom Apostol's "Introduction to Analytic Number Theory":

Theorem (Dirichlet's Theorem): If $h$ and $k$ are relatively prime integers with $k > 0$, then there are infinitely many primes in the arithmetic progression $\{kn+h \colon n = 1,2,3,\ldots\}$.

To prove this theorem, he proves a number of lemmas, such as

Lemma 7.4: If $x > 1$ we have

$$\sum_{\substack{p \leq x \\ p \equiv h \pmod{k}}} \frac{\log p}{p} = \frac{1}{\phi(k)} \log x + \frac{1}{\phi(k)} \sum_{r=2}^{\phi(k)} \overline{\chi_r(h)} \sum_{p \leq x} \frac{\chi_r(p)\log p}{p} + \mathscr{O}(1),$$

and

Lemma 7.5: For $x > 1$ and $\chi \neq \chi_1$, we have

$$\displaystyle\sum_{p \leq x} \frac{\chi(p)\log p}{p} = -L_{\chi}'(1) \displaystyle\sum_{n \leq x} \frac{\mu(n)\chi(n)}{n} + \mathscr{O}(1),$$

and so forth. He has, in total, about 5 or 6 such lemmas which are steps in the proof of the theorem stated above. These results are complicated and substantial (far more than Fatou's lemma!), but they are called lemmas because if you began proving Dirichlet's Theorem and proved them in the middle of that proof, you would easily get lost.

So really, what a lemma is to you is whatever you want it to be. The word can be part of the proper name of a result, as in Zorn's lemma, or it can simply be a device for making an exposition more readable.
