The question hinges on the best answer to a previous question, when are two proofs the same? I believe that the only satisfactory answer to this earlier question is by considering the construction implicit in the proof. Two proofs are the same when they give the same construction.
To isolate a construction, one must carefully distinguish between a proof of statements of the form "There exists..." and "For all...", and one must also distinguish between a statement and its double negation. For the purposes of practically distinguishing proofs, one does not need to be so pedantic most of the time. For example, the "topological proof" of the infinity of primes is essentially the same as Euclid's proof, because given a collection of primes, when unpacked to its construction, it builds the same number.
One must note here that the problem of deciding when two programs produce the same answers is undecidable, even more so when the programs have access to oracles, which is necessary sometimes. Further complicating matters is the fact that certain programs are only superficially different to the eye, but are essentially the same. Nevertheless, I think this is a useful heuristic, which has a chance of having a precise counterpart.
The construction in Godel's theorem is often obscured by the heavy coding involved. Since today, coding is standardized in computer science, I prefer to state the construction explicitly as a computer program, instead of as a coded statement of first-order logic. Two proofs are the same when they construct the same computer program.
There are exactly three types of unpacked proofs of Godel's theorem and related results, as far as I know. To save typing, a program P "runs" iff it does not halt.
TYPE I: self referential pi-0-1 statements (statements about the non-halting of a certain computer program)
GODEL_1: To prove Godel's theorem Godel's way (as clarified by Turing and Kleene), given an axiomatic system S whose deduction system is computable, you construct the program GODEL which does the following:
- It prints its own code into a variable R. (This is possible since you can write quines, and make quining into a subroutine)
- It deduces all consequences of S, looking for a proof in S of the statement "R does not halt" ("R runs").(This is a statement of arithmetic of the form "forall n F^n(R) is non-halting", where F is a primitive recursive instruction set for any computer you care to code up).
- If it finds this statement, it halts.
The statement G--- "GODEL runs" is true precisely when S does not prove it. So G states "S does not prove G". The self reference is obvious in the first step of the program. This is equivalent to Godel's original construction.
From the construction, you can read off the requirements on the axiom system. In order to be sure that GODEL halting leads a contradiction, the axiom system has to be able to prove every statement of the form "program P leads to memory state M after time t" for all integer times t, and for all programs P. Each of these statements is a finite computation, a sigma-0-1 statement, so the axiom system must be able to prove all true sigma-0-1 statements (or even just a subset of these rich enough to allow a computer to be embedded in the language).
GODEL_2: Note that if S is inconsistent, it proves every statement, including "GODEL runs", so "S is inconsistent" implies "GODEL halts". "GODEL halts" also implies "S is consistent", if S can prove that it is sigma-0-1 complete. So the unprovability of "GODEL halts" is tantamount to the unprovability of consis(S).
But S can (falsely) prove "GODEL halts", without any contradiction, so long as GODEL never actually halts. This is saying that it is possible for an axiom system to prove its own inconsistency without actually being inconsistent, just by telling lies about computer programs. The assumption that S is omega consistent (or even just sigma-0-1 sound), means that it does not prove "P halts" unless P actually halts. So the same construction of GODEL proves the second incompleteness theorem as stated by GODEL, an omega-consistent system (or a sigma-0-1 sound system) cannot prove its own consistency.
The proofs of Godel's theorem which go through the halting problem all give this construction.
ROSSER: The program ROSSER is just a slight modification of GODEL. ROSSER does this:
- prints its code into a variable R
- looks in S for a proof of 1. "R prints to the screen" or 2. "R does not print to the screen".
- If it finds 1. It halts without printing, if 2, it halts after printing "hello".
Now note that a consistent S cannot prove 1 nor 2, because either way, there is a halting computation that contradicts the statement. So a consistent S is incomplete. If we call the statement "ROSSER prints to the screen" by the name R, and its negation 2 by the name "notR", then "ROSSER does not print" is iff equivalent to "S proves R before notR", which is the standard gloss for ROSSER's construction.
ROSSER's statement is different than GODEL statement, because the statement "ROSSER does not print" is not equivalent to the statement "S is consistent". But since the slightly different statement "ROSSER does not halt" actually is equivalent to "S is consistent", ROSSER's construction includes GODEL's construction in a simple way.
Here are some simple modifications which also prove Godel's theorem:
PROOF_LENGTH: Given a provable statement of length L bytes in an axiomatic system S, there is no computable function of L, f(L), which bounds the length of the proof of L (relatively short theorems can have enormously long proofs).
construct PROOF_LENGTH to do the following:
- Print its own code into a variable R
- look through all deductions of S of length up to f(|R|) bytes for a proof of "R prints ok"
- If you finds it, halt without printing
- If not, print "ok" and halt.
In this case, the construction is clarified with a gloss: suppose f(L) exists, then you can decide the halting problem by running through all proofs of length f(|"P halts"|) for a proof of "P halts". If you don't find it, then P doesn't halt.
This is also a proof of Godel's theorem, since if S is complete, then it will decide all statements of the form "P halts", and then you can compute the function f which is the length of the proof of the statement. But the program constructed is essentially the same as GODEL (actually ROSSER, in the version I gave here).
But the explicit construction does give you an important corollary: if you assume S is consistent, then just by the form of PROOF_LENGTH, you can see that PROOF_LENGTH has to print "ok" independent of the function f, since if it does not, this means it has found a proof that it didn't. So the assumption of consis(S) will collapse this f dependent enormously long proof to a short f-independent proof of the same statement.
This construction is just a finitary version of the original GODEL program, and this theorem is called the GODEL speedup theorem. The assumption of consis(S) reduces the length of proofs of certain statements by an amount greater than any computable function of the length.
LOB: Given an axiomatic system S, consider the program LOB which, given statement A, does the following:
- Prints its code into R
- Deduces consequences of S, looking for "R halts implies A"
- if it finds this theorem, it halts.
"LOB halts" only if S proves "LOB halts implies A", and then S also proves "LOB halts", so it proves A by modus ponens, so LOB halts iff A. But "LOB halts" is equivalent to "(S proves LOB halts) implies A", and therefore to "(S proves (S proves A) implies A)". Therefore, S proves "S proves ((S proves A) implies A) iff (S proves A)". This theorem can be repackaged into an infinite sequence of ever more obscure statements, by replacing "LOB halts" with its different equivalent forms (some of which contain itself), and eventually closing the recursion. The full set of Lob statements is generated by a simple recursive grammar.
LOB's theorem does not prove Godel's theorem, but it extends it. The proof is of a similar kind.
TWEEDLEDEE and TWEEDEDUM: consider two programs as follows:
TWEEDLEDEE:
- prints TWEEDLEDEE's code into ME, and TWEEDLEDUM's code into HE
- looks for 1. ME runs 2. HE runs
- if it finds 1. it halts, if it finds 2. it prints "tweedle-dee-dee!" and goes into an infinite loop.
TWEEDLEDUM:
- prints TWEEDLEDUM's code into ME, and TWEEDLEDEE's code into HE
- looks for 1. ME runs 2. HE runs
- If it finds 1. it halts, if it finds 2. it prints "tweedle-dee-dum!" and goes into an infinite loop.
These give a kind of splitting theorem for axiomatic systems which satisfy the hypotheses of GODEL's theorem. "DEE runs" and "DUM runs" are both unprovable in S, since proving either one leads to a contradiction. "consis(S)" implies "DEE runs & DUM runs", and conversely "DEE runs & DUM runs" implies consis(S).
So if S is inconsistent, then one of TWEEDLEDEE or TWEEDLEDUM has to halt But which one? This is not decidable in S. That is, S+"DEE runs" is a theory which is strictly stronger than S, since it proves "DEE runs", but is weaker than S+"consis(S)" because it cannot prove "DUM runs".
To prove this, note that S proves "DEE runs or DUM runs", i.e. "DEE halts implies DUM runs". So if S also proved "DEE runs implies DUM runs", it would prove plain old "DUM runs", which is impossible.
The reason for the spurious print statement is just to make absolutely sure that the programs DEE and DUM, which are so similar, don't end up identical, which would wreck the proof (this subtlety is hard to see if you don't unpack the construction into an explicit program, but it is also easy to avoid by using different variable names, or extra spaces, or whatever).
This construction is strictly stronger than GODEL's. It shows that for any sound system S, the implication "DEE runs implies DUM runs" is unprovable. The construction provides a proof of Godel's theorem, although it is similar to ROSSER (The statement DEE halts is provably NOT the negation of "DUM halts", that's the whole point)
I wondered if this construction was in the literature for a long time. I recently ran across it in "The Realm of Ordinal Analysis" by Michael Rathjen (proposition 2.17 on page 14). He couldn't find it in the rest of the literature, but the methods are sufficiently well known (and sufficiently close to Rosser's) to make it folklore. But, as emphasized by Rathjen, the result is significantly stronger than the usual theorems.
TWEEDLE_N: To push this further into uncharted territory, consider the infinite sequence of programs TWEEDLE_N (where N is an integer)
TWEEDLE_N:
- loops over M, printing the code of TWEEDLE_M into a variable R(M)
- Deduces consequences of S, looking for a theorem of the form "TWEEDLE_M runs" for some M
- If it finds this theorem, and M=N, it halts. If M != N, it goes into an infinite loop.
It is easy to see that either all TWEEDLE-N's run, or exactly one of them halts, something which S can prove, because steps 1+2 (which must be run simultaneously in two threads) are the same for all the programs. But S cannot prove that any single one of them runs.
To prove this, note that if there is an effective list of programs A_N (like the TWEEDLE's), you can make a program MERGE(A_N) which generates and runs all of the programs on parallel threads and halts exactly when any one of them does. Then S proves that either TWEEDLE_k runs or MERGE(A_r (r!=k)) runs. That is, TWEEDLE_k and MERGE(all the others) form a TWEEDLEDEE/TWEEDLEDUM pair. This means that it cannot prove that one runs implies the other runs.
The result is that for any computable partition of the TWEEDLE's into two disjoint subsets A and B, S cannot prove that the TWEEDLE_A's run implies the TWEEDLE_B's run, although consis(S) proves that all the TWEEDLE's run. The theories "S+all the TWEEDLE_A's run" are sound theories strictly between S and S+consis(S) in terms of Pi-0-1 content--- they prove new correct theorems about the non-halting of computer programs, but they are weaker than S+consis(S) (and weaker than each other in a way described by the partial order of set containment).
I like this theorem, because it is a proof which is very dastardly to translate to more traditional logic language. I think that computational language is more natural for these results.
I could go on making more complicated self-referential proofs (and I think this is an interesting thing to do, they all prove somewhat different things), but I will stop here to consider non self-referential proofs, which work at a higher level of the arithmetic hierarchy.
TYPE II: these prove that there exist total functions which are not provably total. The statements in this case are pi-0-2, statements about the totality of some computable function.
FASTER_GROWTH: Given axiomatic system S, consider all computable functions f from the integers to the integers that are proven to be total (that is, which halt for all arguments). Now construct the program FASTER_GROWTH(n) which does the following:
- Lists the first n functions which are provably total, and computes their value at position n.
- returns the biggest value at n, plus 1.
If S is sound for pi-0-2 statements, then there are infinitely many provably total functions, and FASTER_GROWTH halts at every input. Further, FASTER_GROWTH is eventually bigger than any function provably total in S. So "FASTER_GROWTH is total" is an unprovable true theorem.
The function FASTER_GROWTH is constructed entirely from other functions which are not equal to itself. The requirement on the theory is that when it proves a function is total, it is telling the truth, otherwise FASTER_GROWTH will get stuck in an infinite loop at some point. This is the pi-0-2 soundness. The pi-0-2 soundness proofs generally construct this type of thing, when unpacked.
The most common abbreviated form of this argument runs as follows: given an axiomatic system S, diagonalizing against all provably total recursive functions in the theory gives a total recursive function which the theory cannot prove is total. This argument is folklore.
Type II Godel theorems provide a different way to strengthen the axiom system, by adding the statement of the totality of FASTER_GROWTH. This statement implies consistency of S, but is strictly stronger, since consistency is not enough to ensure FASTER_GROWTH is total (you need some soundness).
TYPE III: nonconstructive theorem about a large class of statements, which do not provide an explicit unprovable statement, and so cannot be used to step up the heirarchy of systems.
BOOLOS: There is no computer program which will output the true answer to statements of the form "Integer N can be named using k bytes or less worth of symbols of Peano Arithmetic"
write program BOOLOS:
1. loops over all integers N, looking for the first N which requires more than M symbols to name, where M is the length of the symbols describing the output of BOOLOS, translated to arithmetic.
2. prints out N.
The contradiction means that BOOLOS does not work. Boolos is not so great, because it isn't focused on a particular system, but it's the same basic idea as...
CHAITIN: Which replaces the notion of definability with Kolmogorov complexity, which is definability by an algorithm. Write the program CHAITIN to do the following:
- List all proofs of S, looking for "the Kolmogorov complexity of string Q is greater than N" (where N is the length of CHAITIN)
- Print string Q
Now if S ever proves that the Kolmogorov complexity of any string is greater than the length of CHAITIN, then CHAITIN will make S into a sigma-0-1 liar (inconsistent). This proves that there is a completely effective bound on the maximum provable Kolmogorov complexity of any string. (this was given previously as an answer).
There is an infinite list of true sentences of the form "The Kolmogorov complexity of Q is N", since there are infinitely many strings and only finitely many programs of length less than N. But only a finite number of these theorems get decided by any given axiom system S. This is a less explicit proof, because you can't be sure which strings are unprovably complex, so there isn't a natural axiom to add on to strengthen the system.
The statement "the Kolmogorov complexity of Q is N", translated to Arithmetic, is forall P, there exists N, ((F^N(P) is a halted state with output Q) implies |P|>n), so that it's Pi-0-2.
Now to identify which proofs is what type:
- Self-referential sentence proofs--- type I (Godel, Rosser, Kleene, Post, Church, Turing, Smullyan, popular works)
- Epsilon-naught induction proofs--- these are type II, but specific to Peano Arithmetic. The general version is the one presented above (Kripke's proof, and Paris-Harrington, Goodstein, Hydra). The version they give is that the limit ordinal of all recursive provable ordinals is recursive but not provably recursive, but this is a type II argument.
- Jech/Woodin Set theory model proof--- despite all its elegance and generality, the proof is type 1 when formulated computationally. I will elaborate below
- Chaitin/Boolos--- type III. I don't know any other type IIIs.
By the way, I agree with Sergei that finite axiomatizability (although emphasized by Putnam for some reason) is not so important. That property depends on exactly how you choose your axioms. The completeness theorem is strong enough to get a general computation from only finitely many axioms. The proof of impossibility of finite axiomatization (when it holds) is that the theory is self-reflecting, it can prove the consistency of any finite fragment (this is true in PA, because PA proves the consistency of induction restricted to level N in the Arithmetic Hierarchy), and the axiomatization is weak, in that finitely many axioms are stuck in some finite fragment no matter how many times you use them. Self-reflection is interesting, but not that relevant for the incompleteness theorem.
To see that the Jech/Woodin proof is really a type I proof in disguise, it is important for the purpose of unpacking the construction to supplement the set theory with an effective procedure to give computational meaning to the models. This is just first order logic and the completeness theorem, as Andreas Caicedo states in the introduction. (To be continued --- I am too tired to avoid wrong statements--- sorry for the excessive length)
Best Answer
Two geometric proofs are given below. Both proofs start off as in Euler's proof, by considering the image $C_{2}$ of a great circle $C_{1}$ under the motion. In proof (1) this is used to construct a non-great circle which must be mapped onto itself due to a certain orientation property that is preserved - the axis of that circle is then the Euler Axis. In proof (2) it is shown the final displacement of the great circle can be achieved via a composition of two $ 180^\circ $ axial rotations which then gives the Euler Axis as the normal to the plane containing these two axes. For a non-zero motion the Euler Axis must be unique since it implies there exist exactly two fixed points, namely its endpoints - with all the other points rotated by a common angle not a multiple of $ 360^\circ $.
The following terms are used :
The lemmas cover the simple special cases and define the notion of 'orientation' used in Proof (1).
Lemma 1
A motion of a sphere about its center $O$ which leaves a point $P$ on the sphere fixed is equivalent to an axial rotation about $OP$. Hence the antipode $P'$ of $P$ is fixed also, and if the motion is non-zero $P$ and $ P' $ are the only fixed points.
Proof
There are no possible final positions of the sphere in which $ P $ is fixed, other than axial rotations about $ OP $ from the original position, since with $ P $ fixed the situation of the sphere is constrained from every other possible motion. The antipode $ P' $ is the opposite end of this axis and hence is fixed also. For a non-zero motion the angle of axial rotation cannot be a multiple of $ 360^\circ $, thus ALL points other than $ P $ and $ P' $ must be moved.
Lemma 2
Given a motion $M_{1}$ of a sphere $ S $ about its center $ O $, then a second motion $M_{2}$ which places a circle $ C $ of $ S $ (great or non-great) identically to $M_{1}$ is equal to $M_{1}$.
Proof
No other possible final position of $ S $ than that of $M_{1}$ can have $ C $ placed completely 'correctly' because the sphere is completely constrained by this criteria - for, once the final positions of all the points of a circle on a fixed center sphere have been determined the final positions of all the other points of the sphere have been determined also. Thus $M_{2}$ must equal $M_{1}$.
Lemma 3
A motion of a sphere about its center $ O $ which overlays a circle $ C $ (great or non-great) onto itself in some manner is equivalent to an axial rotation.
Proof
(i) If $ C $ is non-great then as in Lemma 1 the sphere is constrained so no net displacement other than a rotation about the circle's axis is possible.
(ii) If $ C $ is a great circle then it must either be :
(a) overlayed the 'same way up', in which case the same argument as (i) applies, or
(b) overlayed but 'flipped over'. Consider an arbitrary point $ P $ on $ C $ and its image $ P' $ under the motion ($ P' $ may equal $ P $), as in the 'plan view' of Fig 1 :
A $180^\circ$ rotation $\phi$ about axis $ D $ of symmetry of $ P $ and $ P' $ places $ C $ the 'right way up' and puts $ P $ onto $ P' $. This must then place all the other points of $ C $ in the correct position, and so by Lemma 2, $\phi$ is equivalent to the original motion.
Lemma 4
Any motion of a sphere about its center $ O $ in which a diameter is flipped is equivalent to a $ 180^\circ $ axial rotation.
Proof
This causes the great circle of the diameter to be flipped over onto itself, and thus by Lemma 3 case (ii)(b), the result follows.
Lemma 5
Given two non-diametrical points $ A $ and $ B $ on a sphere $ S $ of radius $ R $, then if $ d $ is the straight-line distance $ AB $, the circles on the sphere which contain $ A, B $ are :
(i) a unique minimal radius circle of radius $ r = d/2 $,
(ii) a unique maximal radius great circle of radius $ r = R $,
(iii) for every intermediate radius $ r \in (d/2, R) $, exactly 2 circles of radius $ r $.
Proof
In the 'Hoopla Construction' in Fig 2 below, $ A $ and $ B $ are viewed at $ D $, with $ A $ in front of $ B $. The set of circles on $ S $ containing $ A, B $ corresponds to the set of planes through the axis $ AB $, as they cut $ S $, such as $ \Gamma $ and $ \Delta $. $ \theta = 0^\circ $ gives case (i), $ \theta = 90^\circ $ gives case (ii), and $ \theta \in (0, 90^\circ) $ gives case (iii), with $ r = \sqrt{ R^2 - l^2 cos^2 \theta } $ (an increasing function of $\theta$).
Definition
Given two non-diametrical points $ A $ and $ B $ on a circle $ C $, 'orientation of $A, B$ on $C$' is either CW or ACW according to the sense of the minor arc from $ A $ to $ B $.
With this definition : (i) 'orientation of $A, B$ on $C$' flips when we view from the other side of $C$, (ii) the orientation is undefined for diametrical points $A, B$ of $C$, and (iii) the 'orientation of $B, A$ on $C$' is opposite from the 'orientation of $A, B$ on $C$'.
Lemma 6
Given any non-great circle $C$ on a sphere $S$ and two non-diametrical points $A, B$ of $C$, the orientation of $A, B$ (as viewed from 'non-$O$' side of circle $C$, ie the 'outside' of $S$) is preserved after any motion of $S$ about $O$.
Proof
Center $O$ of sphere never crosses or touches the plane of circle $C$, so circle $C$ is always being viewed from the same side, and so as $A, B$ are fixed onto $C$, their orientation on $C$ remains the same.
Lemma 7
If two non-diametrical points $A, B$ on a sphere $S$ of radius $ R $ lie on two distinct circles of common radius $ r $ on the sphere, then $A, B$ (viewed from non-$O$ side) have opposite orientations on these respective circles.
Proof
From Lemma 5, the two distinct circles of same radius implies case (iii), so the circles are non-great.
Thus from the 'Hoopla' diagram of Fig 2, with $A, B$ seen at $ D $ with $ A $ in front of $ B $, we have $\theta \in (0, 90^\circ)$.
The circle in plane $ \Gamma $ gives $A, B$ with orientation ACW, whilst the other circle in the mirror image plane $ \Delta $ gives $A, B$ with orientation CW.
Lemma 8
Given two diameters $ L $ and $ M $ of sphere $S$, then the motion which is the composition of a $ 180^\circ $ rotation about $ L $ followed by a $ 180^\circ $ rotation about $ M $ is equivalent to a single axial rotation.
Proof
The case $ L = M $ is trivial as the composition is a zero motion.
Otherwise consider the great circle $C$ defined by the plane containing $ L $ and $ M $, and let $ P $ and $ P' $ be the poles of $C$, as in Fig 3.
The rotation about $ L $ flips $ P $ and $ P' $, as does the rotation about $ M $. Hence the composition leaves $ P $ fixed. By Lemma 1 the result then follows, with the axis being the normal to the plane containing $ L $ and $ M $.
Proof 1
Suppose great circle $C_{1}$ is mapped onto great circle $C_{2}$. Assume planes $C_{1}$ and $C_{2}$ do not coincide (otherwise Lemma 3 completes the proof).
Let $C_{1}$ and $C_{2}$ intersect along a diameter $ BF $ (the 'line of nodes'), as shown in Fig 4.
Since $ B $ lies on $C_{2}$, it must have been mapped from some point $ A $ on $C_{1}$. Assume $ A \neq F $ (otherwise the proof follows from Lemma 4), and $ A \neq B $ (otherwise the proof follows from Lemma 1). $ A $ is shown on the left of $ BF $ in Fig 4 - if it was on the right, we could rotate the diagram around $ 180^\circ $ about $ BF $ so $ A $ is on the left. The dihedral angle $ \delta \in (0, 180^\circ) $.
Let $ \Omega \in (0, 180^\circ) $ be the angle $\angle A\widehat{O}B$.
$ B $ also lies on $C_{1}$ so it maps to some point $ E $ on $C_{2}$. So from the rigid body motion of $C_{1}$, angle $\angle B \widehat{O} E = \Omega$, and $\mbox{chord } |AB| = \mbox{chord } |BE| $ (on $C_{1}$, $C_{2}$ respectively). $E$ is shown above $C_{1}$ in Fig 4, but the same argument as below applies if it is below.
Also plane $A, O, B = \mbox{plane } C_{1}$, and plane $B, O, E = \mbox{plane } C_{2}$.
$ A, E, B $ cannot be collinear because that would imply $ E $ to be in plane $C_{1}$, so the plane $C_{2}$ defined by $ B, O, E $ would then be in plane $C_{1}$ - a contradiction.
Thus $ A, E, B $ define a unique plane, containing 3 distinct points of sphere $S$. That plane cannot pass through $O$ since then all of $ A, E, B, O $ would lie in the same plane, again implying $C_{1}$, $C_{2}$ coincident - a contradiction. Let the non-great circle defined by this plane be $C$, and let its image be $ D $.
We show $ C = D $. Firstly note that although $C$ is a smaller radius circle than $C_{1,2}$, chords $AB$ and $BE$ can't be diameters of $C$ because that would imply $A = E$ - a contradiction - so the orientations below are well-defined. Consider the points $ B, E $ which lie on $C$. They must also lie on $ D $, being the image of $A, B$. But (viewing from non-$O$ side) :
orientation of $ B, E $ on $C$ = orientation of $A, B$ on $C$,
because the non-diametrical equal length chords $AB$ and $BE$ of $C$ subtend the same angle within $C$, and these chords lie to either side of point $B$ by virtue of $A \neq E$.
And secondly, considering the rigid motion taking $C$ to $ D $ :
orientation of $ B, E $ on $ D $ = orientation of $A, B$ on $C$,
by Lemma 6.
So $ B, E $ have the same orientation on circles $ C, D $. But by Lemma 7, as $ C, D $ have common radius, this means $ C = D $, from which the proof now follows from Lemma 3 case (i), the Euler Axis being the axis of circle $C$.
Proof 2
View $C_{1}$ and $C_{2}$ as shown in Fig 5. Cases $ \theta = 0^\circ $ and $ \theta = 90^\circ $ follow from Lemma 3 case (ii), so assume $ \theta \in (0, 90^\circ) $.
$C_{1}$ can be made to overlay $C_{2}$ by $ 180^\circ $ rotation about $ L $ or about $ M $.
The first of these places upper side of $C_{1}$ onto lower side of $C_{2}$, while the second places the upper side of $C_{1}$ onto the upper side of $C_{2}$.
Choose whichever of these results in $C_{1}$ being overlayed onto $C_{2}$ the 'wrong way up'. Then by Lemma 3 case (ii) (b) a $ 180^\circ $ rotation about some axis within $C_{2}$ places $C_{1}$ exactly in the 'correct' position of $C_{2}$.
Thus we have achieved the correct final position for $C_{1}$ by a succession of two $ 180^\circ $ axial rotations, and thus by Lemma 8 and Lemma 2 the proof follows.