As Noah Schweber points out, results like the conservativity of
expansions-by-definitions give us some confidence. But I think there's
more to it than that. On the other hand, as always with
informal-to-formal matters (e.g., Church's Thesis),
we cannot expect to prove that formalization is always possible.
First a couple of analogies. In topology, one often
runs into claims about continuity based on geometric
intuition (e.g., the "house with two rooms", Hatcher p.4).
No one doubts that these could be reformulated as
$\epsilon$-$\delta$ proofs, but nobody would want to
carry out the rewriting. In computability theory,
one can tell that certain functions are computable
without actually writing programs, let alone constructing
Turing machines.
In these and similar cases, the process of formalization
is routine without being algorithmic. Once you've gained
experience, you know what tools to deploy, and what pitfalls
to worry about. To quote from an MO answer to the question
of formalizing Wiles' proof in Peano Arithmetic:
> Experts can "smell" the intrusion of a strong axiom from a long way off; for example, the primary reason that the Robertson-Seymour graph minor theorem requires more than PA is that it invokes a highly sophisticated argument by induction at a crucial point. The proof of FLT doesn't give off this sort of odor anywhere. Of course, a smell test is not a proof, but the smart money is that if we run into difficulties trying to prove FLT in PA, it won't be because we need some strong arithmetic axiom.
We should not forget that both the predicate calculus and the ZFC axioms
arose from research programs of formalizing mathematical reasoning.
For the predicate calculus, the Completeness Theorem gives us
assurance that nothing was "left out". For ZFC, we have no comparable
formal result, but the intuition behind the Cumulative Hierarchy
is very suggestive.
Drilling down a bit, what makes this formalization process
seem routine? I think it's because it is itself a "drilling down".
Let's look at your example, the proof in ZFC of Con(PA).
If one had to formalize it, one would start by breaking it into
parts. At the top-level, the proof consists of two parts:
- The Soundness Theorem: if a theory $T$ has a model whose
domain is a set, then it is consistent.
- PA has such a model in ZF, namely the finite ordinals.
What are key issues in formally proving the Soundness Theorem?
They split into syntax and semantics. For syntax, all the
pieces are "finite combinatorial structures": proofs are
sequences of formulas, formulas sequences of symbols, etc.
We have a great deal of confidence that ZFC can handle such
things without even breathing hard.
Semantics seems prima facie like a bigger deal. But here
again we have a roadmap: Tarski's inductive definition
of truth (aka satisfaction), a piece of ordinary mathematics,
not particularly convoluted. General results of ZFC handle
the rather vanilla inductions and recursive definitions.
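To make the inductive clauses concrete, here is a minimal sketch of Tarski's definition of satisfaction for a toy first-order language (just atomic relations, negation, conjunction, and a universal quantifier) over a finite structure. The formula encoding and all names are illustrative choices of mine, not anything canonical:

```python
# Formulas are nested tuples: ("rel", name, vars), ("not", f),
# ("and", f, g), ("forall", var, f).

def satisfies(structure, formula, assignment):
    """Tarski's inductive definition of satisfaction: one clause
    per formula constructor."""
    domain, relations = structure
    kind = formula[0]
    if kind == "rel":                      # atomic case
        _, name, vars_ = formula
        return tuple(assignment[v] for v in vars_) in relations[name]
    if kind == "not":                      # negation clause
        return not satisfies(structure, formula[1], assignment)
    if kind == "and":                      # conjunction clause
        return (satisfies(structure, formula[1], assignment)
                and satisfies(structure, formula[2], assignment))
    if kind == "forall":                   # quantifier ranges over the domain
        _, var, body = formula
        return all(satisfies(structure, body, {**assignment, var: a})
                   for a in domain)
    raise ValueError(f"unknown constructor: {kind}")

# The structure ({0, 1, 2}, <) satisfies "forall x, not (x < x)":
less = {(a, b) for a in range(3) for b in range(3) if a < b}
M = (range(3), {"lt": less})
irreflexive = ("forall", "x", ("not", ("rel", "lt", ("x", "x"))))
print(satisfies(M, irreflexive, {}))   # True
```

The recursion on the shape of the formula is exactly the "vanilla induction" in question; in ZFC it is licensed by general recursion theorems.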
I had occasion to write up a more detailed treatment; in it,
I made free use of the ability of ZF to handle trees formally.
If that were challenged, one would drill down into the details
with resignation, perhaps, but without trepidation.
In Models of Peano Arithmetic, Kaye deals
with the more difficult problem of formalizing certain
notions of truth inside PA. He opens the relevant chapter with
the sentence, "This is the chapter that no one wanted to have to write..."
Nonetheless, working out the details at no point required
real inspiration, just persistence. (The one non-obvious trick,
representing sequences of arbitrary length in PA, was provided
by Gödel long ago.)
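For the curious, Gödel's trick (the $\beta$ function) is easy to demonstrate outside PA. The sketch below, with helper names of my own choosing, codes a finite sequence of naturals as a pair $(a, b)$ via the Chinese Remainder Theorem:

```python
from math import factorial

def beta(a, b, i):
    # Gödel's beta function: recovers the i-th term of the
    # sequence coded by the pair (a, b).
    return a % (b * (i + 1) + 1)

def code_sequence(seq):
    # Pick b = m! with m exceeding the length and every entry; then the
    # moduli 1 + b(i+1) are pairwise coprime and larger than each entry,
    # so the Chinese Remainder Theorem supplies a suitable a.
    m = max([len(seq)] + list(seq)) + 1
    b = factorial(m)
    a, prod = 0, 1
    for i, s in enumerate(seq):
        mod = b * (i + 1) + 1
        while a % mod != s:     # incremental CRT: adjust a modulo prod
            a += prod
        prod *= mod
    return a, b

a, b = code_sequence([3, 1, 4, 1, 5])
print([beta(a, b, i) for i in range(5)])   # [3, 1, 4, 1, 5]
```

The point for PA is that $\beta$ involves only remainders, so "there is a pair coding a sequence with such-and-such properties" becomes a first-order arithmetic statement.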
Ultimately, drilling down should reach bedrock: something formalizable
in ZFC. Of course you can't prove that it will, short of doing it. But it
would be quite surprising if it didn't.
All that said, surprises can occur. For example, a theory I call
ZFC$\neg\infty$ -- that's ZFC with the axiom of infinity replaced with its
negation -- is, according to folklore, "equivalent" to PA. But when
Kaye and Wong looked into the details, they found that for a strong
notion of equivalence, ZFC$\neg\infty$ must be supplemented with
another axiom. (See here for more details.
The Kaye-Wong paper is "On Interpretations of Arithmetic and Set Theory",
Notre Dame Journal of Formal Logic, 2007.)
Two more examples.
Much of modern category theory uses Grothendieck Universes;
these are usually justified by appealing to a large cardinal
axiom. Finally, as I understand it, it is still not settled whether
Wiles' proof can be formalized in PA. An answer to the MO question mentioned earlier remarks,
"Wiles' original proof of FLT uses set-theoretical assumptions unprovable in Zermelo-Fraenkel set theory with axiom of choice", but goes on to quote a paper saying "certainly much less than ZFC is used in principle, probably nothing beyond PA, and perhaps much less than that."
Best Answer
There is no issue with handling the existence and uniqueness of base-$b$ representations for an arbitrary base $b$ in ZF or even in Peano arithmetic. Existence and uniqueness of base-$b$ representations says that every non-negative integer $n$ can be written uniquely in the form
$$n = \sum_{i=0}^k d_i b^i$$
where $0 \le d_i \le b-1$ and $d_k \neq 0$ (unless $n = 0$). Furthermore we have $d_i = \left\lfloor \frac{n}{b^i} \right\rfloor - b \left\lfloor \frac{n}{b^{i+1}} \right\rfloor$.
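As an informal sanity check (function names are mine), the closed-form digit formula is easy to verify computationally:

```python
def digits(n, b):
    # d_i = floor(n / b**i) - b * floor(n / b**(i+1)), per the formula above;
    # digits are listed least-significant first.
    ds = []
    i = 0
    while b ** i <= n:
        ds.append(n // b ** i - b * (n // b ** (i + 1)))
        i += 1
    return ds or [0]   # n = 0 has the single digit 0

# Reconstructing n from its digits confirms existence, and the range
# condition 0 <= d_i <= b-1 holds in each case tested:
n, b = 2024, 7
ds = digits(n, b)
print(ds)                                          # [1, 2, 6, 5]
assert n == sum(d * b ** i for i, d in enumerate(ds))
assert all(0 <= d <= b - 1 for d in ds)
```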
To say all of this in ZF you think of the sequence $d_i$ of digits as a function $\mathbb{N} \to \mathbb{N}$ and define the meaning of $\sum$ by induction in the obvious way. Then the above statement is also proven by induction.
Generally speaking if you want to talk about strings of elements of any set $S$ in ZF you code them as functions $\mathbb{N} \to S$. There is no need to go outside of set theory.
It's slightly more annoying to do this in (first-order) Peano arithmetic because Peano arithmetic is not capable of directly quantifying over functions $\mathbb{N} \to \mathbb{N}$. It can still be done but it requires a way to code finite strings of natural numbers as natural numbers; there are various ways to do this (e.g. Gödel did it using prime factorizations, so a natural number $n = \prod p_i^{e_i}$ was used to code the finite string $(e_1, e_2, \dots)$) but they're all kind of annoying. One of them is to use binary representations!
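The prime-factorization coding just mentioned looks like this (with my own helper names and a naive primality test that is fine for illustration); note that trailing zero exponents are lost unless the length is stored separately:

```python
def primes():
    # naive generator of the primes 2, 3, 5, ... -- adequate for a demo
    n = 2
    while True:
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            yield n
        n += 1

def prime_code(seq):
    # code (e_1, e_2, ...) as 2**e_1 * 3**e_2 * 5**e_3 * ...
    n = 1
    for p, e in zip(primes(), seq):
        n *= p ** e
    return n

def prime_decode(n, length):
    # recover the exponents by repeated division; the length must be
    # supplied out of band since trailing zero exponents are invisible
    out = []
    for p, _ in zip(primes(), range(length)):
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        out.append(e)
    return out

print(prime_code([3, 1, 2]))    # 600 = 2**3 * 3**1 * 5**2
print(prime_decode(600, 3))     # [3, 1, 2]
```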