Let me phrase the argument in somewhat more modern terms:
Goedel constructs a means of encoding any computer program's source code by an arithmetic formula, such that he can prove that, for any program which eventually outputs "YES", Peano Arithmetic (PA) proves the corresponding formula, and for any program which eventually outputs "NO", PA disproves the corresponding formula.
Then Goedel* constructs a computer program with the code "Search through all the possible proofs in PA till you find either a proof or a disproof of the formula corresponding to my source code. If you find a proof first, output 'NO'; if you find a disproof first, output 'YES'. (If you never find either, just keep on searching forever...)" [This is a recursively defined program, in that it refers to its own source code, but that's ok: we understand well how to write up such recursive programs, and even how to compile them to languages that do not directly support recursion. This compilation is essentially what "diagonalization" does]
Now, let p be the formula corresponding to this program's source code. So long as PA either proves or disproves p, this program will eventually output something. But if this program outputs 'YES', then PA must prove p (by the second paragraph) and also disprove p (this is the only way the program ever outputs 'YES'). Similarly, if this program outputs 'NO', then PA must disprove p (by the second paragraph) and also prove p (this is the only way the program ever outputs 'NO'). Thus, if PA either proves p OR disproves p, it necessarily proves p AND disproves p; they're a package deal. So if PA is "complete", then it is inconsistent.
That is the mechanism of the result. It's quite concrete and doesn't depend on any handwavy arguments about meta-statements. It's just a matter of A) knowing how to construct computer programs which can access their own source code, and B) having an appropriate representation of such programs in PA (or whatever system one is interested in), in the sense of the properties of the second paragraph of this post.
[*: I say Goedel, but I actually mean Rosser, five years later; I've chosen to use his approach (which yields a slightly stronger result than Goedel in this context, albeit one which generalizes less) because I think it might be simpler to discuss for now]
The answer is relatively simple, but complicated.
We cannot prove that Peano axioms (PA) is a consistent theory from the axioms of PA. We can prove the consistency from stronger theories, e.g. the Zermelo-Fraenkel (ZF) set theory. Well, we could prove that PA is consistent from PA itself if it was inconsistent to begin with, but that's hardly helpful.
This leads us to a point discussed on this site before. There is a certain point in mathematical research that you stop asking yourself whether foundational theory is consistent, and you just assume that they are.
If you accept ZF as your foundation you can prove that PA is consistent, but you cannot prove that ZF itself is consistent (unless, again, it is inconsistent to begin with); if you want a stronger theory for foundation, (e.g., ZF+Inaccessible cardinal), then you can prove ZF is consistent, but you cannot prove that the stronger theory is consistent (unless... inconsistent bla bla bla).
However what guides us is an informal notion: we have a good idea what are the natural numbers (or what properties sets should have), and we mostly agree that a PA describes the natural numbers well -- and even if we cannot prove it is consistent, we choose to use it as a basis for other work.
Of course, you can ask yourself, why is it not inconsistent? Well, we don't know. We haven't found the inconsistency and the contradiction yet. Some people claim that they found it, from time to time anyway, but they are often wrong and misunderstand subtle point which they intend to exploit in their proof. This works in our favor, so to speak, because it shows that we cannot find the contradiction in a theory: it might actually be consistent after all.
Alas, much like many of the mysteries of life: this one will remain open for us to believe whether what we hear is true or false, whether the theory is consistent or not.
Some reading material:
- How is a system of axioms different from a system of beliefs?
Best Answer
Essentially due to Gödel's Incompleteness Theorems, proofs of the consistency of $\mathsf{PA}$ must involve methods that transcend $\mathsf{PA}$ itself. If one has any doubts about the consistency of $\mathsf{PA}$, those doubts are likely only to be amplified concerning the methods used to prove the consistency of $\mathsf{PA}$. (For example, from $\mathsf{ZF}$, then you can easily construct a model of $\mathsf{PA}$, but the consistency of $\mathsf{ZF}$ is "more debatable" than that of $\mathsf{PA}$, so you haven't really gained anything.)
Gentzen's proof relies on infinitary processes (in particular, induction up to $\varepsilon_0$; see Wikipedia), and may not have been accepted by the Hilbert school (who sought purely finitary proofs of consistency). The ordinal $\varepsilon_0$ is important here because (assuming its consistency) $\mathsf{PA}$ cannot prove that it is well-founded, and it is basically this move that transcends $\mathsf{PA}$.