How can a formalist understand a metalogical proof such as the completeness theorem? These proofs are carried out entirely outside of any particular logical system, where the "rules of the game" are left undefined. Yet such metalogical theorems attest to the adequacy of the very logic a formalist must hold dear, so it seems problematic that they cannot be justified from within the formalist perspective.
Formalist understanding of metalogical proofs
foundations, logic, philosophy
Related Solutions
Perhaps we should distinguish two questions:
(1) Does $PM$ fail as a philosophical/foundational project: if so, why?
(2) Does $PM$ in some sense fail as a mathematical project: if so, why?
Re (1): It is usually said that the philosophical project of PM was to defend a version of logicism, by showing that all arithmetical truths, all truths of classical analysis, and more, can be deduced from purely logical assumptions plus definitions (where the definitions cash out notions like "is a natural number" in purely logical terms). In other words, Russell and Whitehead are trying to complete Frege's logicist project (but this time, in a consistent logical framework!).
Read as an attempt at this philosophical project, $PM$ fails. For a start, its Axiom of Infinity and Axiom of Reducibility are very difficult to defend as logical axioms. And then -- a fatal blow -- Gödel's first incompleteness theorem shows that the ambition of capturing even all arithmetical truths in a single (decidably) axiomatized system must fail. [Aside: ZFC isn't propounded with the same all-encompassing logicist ambitions as $PM$, so isn't challenged in this way by Gödelian incompleteness.]
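To spell out the Gödelian obstacle (a loose schematic statement, not Gödel's exact formulation): for any consistent, decidably axiomatized theory $T$ containing a modest amount of arithmetic,

$$T \nvdash G_T \quad\text{and yet}\quad G_T \text{ is true in } \mathbb{N},$$

where $G_T$ is the Gödel sentence of $T$; so no one such system can prove every arithmetical truth.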
Re (2): The ramified type theory of $PM$ is mathematically messy to work with in various ways -- not just inconvenient, but messy in a way that suggests it is not implementing its leading idea in the mathematically best way. And indeed, as Ramsey pointed out long ago, the mathematical project of $PM$ doesn't need some of the complications (the ones which require the Axiom of Reducibility and which seem to be there for more philosophical reasons, i.e. to give a one-size-fits-all treatment of a whole range of paradoxes, logical and semantic). Strip those away, keeping the key mathematical idea, and we get a near descendant of $PM$ that preserves Russell's central insight: (Simple) Type Theory -- which is alive and well and not a failure at all!
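As a rough illustration of the surviving idea (a schematic sketch in the style of Church's later simple theory of types, not $PM$'s own notation), types are generated by

$$\iota \text{ is a type}; \qquad o \text{ is a type}; \qquad \text{if } \sigma, \tau \text{ are types, then } \sigma \to \tau \text{ is a type},$$

and a predicate may only be applied to arguments of lower type, so the self-application behind Russell's paradox is simply not well-formed.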
Gentzen [1] kick-started the field of structural proof theory: his 1934 paper introduces natural deduction, and in it he already noted that the introduction rules essentially represent definitions of the logical symbols. In the same work, Gentzen took this perspective much further with the sequent calculus, where cut-elimination associates to each proof of a given sentence a corresponding canonical (cut-free) proof whose last rule has a predictable form, depending only on the principal connective or quantifier of the formula.
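For instance (using a standard modern presentation rather than Gentzen's original notation), the natural deduction rules for conjunction are

$$\frac{A \ \ \ \ B}{A \wedge B} \wedge\!I \qquad\qquad \frac{A \wedge B}{A} \wedge\!E_1 \qquad \frac{A \wedge B}{B} \wedge\!E_2$$

and Gentzen's remark is that the introduction rule $\wedge I$ can be read as fixing what $\wedge$ means, with the elimination rules answerable to that definition.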
Cut-elimination allows us to prove that intuitionistic logic satisfies structural properties (what you'd call metatheoretic properties, many of them tellingly named after the connectives) such as the disjunction property (if there is a proof of $A \vee B$, there ought to be a proof of $A$ or a proof of $B$), the conjunction property (if there is a proof of $A \wedge B$, there ought to be a proof of $A$ and a proof of $B$) and consistency (there ought to be no proof of $\bot$) -- properties corresponding to the BHK interpretation of the connectives. In Gentzen's sequent calculus, we can prove all of them the same way: by considering the last rule of a canonical proof, which has to be a right rule for the connective. E.g. to prove the conjunction property, consider a proof of $\vdash A \wedge B$: by cut-elimination, we can turn it into a canonical proof, which must then have the last rule
$$\frac{\vdash A\ \ \ \ \vdash B}{\vdash A \wedge B} \wedge\!R$$
so we indeed have proofs of $\vdash A$ and $\vdash B$ as desired. Similarly, we have consistency: if $\vdash \bot$ had a proof, it would have a cut-free proof, but it obviously has no cut-free proof (what would be its last rule?!). The rules enunciate the BHK-style desiderata: cut-elimination ensures that we indeed get the desiderata. One can play a very similar game in natural deduction, using Prawitz's [2] proof normalization technique (indeed the right rules of the former correspond to introduction rules of the latter, and left rules similarly correspond to elimination rules).
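The disjunction property follows the same recipe back in the sequent calculus: a canonical proof of $\vdash A \vee B$ must end with one of the two right rules for $\vee$,

$$\frac{\vdash A}{\vdash A \vee B} \vee\!R_1 \qquad\qquad \frac{\vdash B}{\vdash A \vee B} \vee\!R_2$$

so its immediate subproof is a proof of $\vdash A$ or a proof of $\vdash B$, exactly as the BHK reading demands.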
This should give you some idea of the desiderata corresponding to the rules of natural deduction, and of the proofs establishing that each of these is indeed met. For a more thorough appraisal and philosophical discussion, check out Francez's book on Proof-theoretic Semantics [3], especially his discussions of inferentialism and of Prior's tonk connective. If you're more interested in the technical details of normalization, Prawitz [2] or Chapter 4 of Dummett [4] are good starting points as well.
[1] Gerhard Gentzen, "Untersuchungen über das logische Schließen. I". Mathematische Zeitschrift 39 (2), 1934.
[2] Dag Prawitz, "Natural Deduction: A Proof-Theoretical Study", Almqvist & Wiksell, 1965; reprinted by Dover Publications, 2006.
[3] Nissim Francez, "Proof-theoretic Semantics", Studies in Logic 57, College Publications UK, 2015.
[4] Michael Dummett, "Elements of intuitionism", Oxford Logic Guides 39, Oxford University Press, 1977.
Best Answer
Method A. By making the metatheory explicit. The completeness theorem is provable in ZFC, so if $\Phi$ is a statement of the completeness theorem in the language of ZFC, then a formalist/finitist can still recognize the formal derivation that shows $\text{ZFC} \vdash \Phi$.
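For concreteness (one of several equivalent formulations; all the syntactic and semantic notions are understood as coded set-theoretically), $\Phi$ can be taken to be

$$\Phi:\quad \forall T\,\big(\,T \text{ a syntactically consistent first-order theory} \ \rightarrow\ \exists \mathcal{M}\ \mathcal{M} \vDash T\,\big),$$

i.e. every syntactically consistent first-order theory has a model; the formalist only needs to recognize that there is a concrete ZFC-derivation of this sentence, not to endorse its set-theoretic content as meaningful.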
This method may not be very satisfying, because it doesn't help a finitist prove anything else. The next method does that.
Method B. Through conservation results. There is a system $\text{PRA}$ of arithmetic that is usually identified with finitistic reasoning. There is also a subsystem $\text{WKL}_0$ of second-order arithmetic with the following properties:
$\text{WKL}_0$ proves the completeness theorem for countable, syntactically consistent first-order theories (every countable first-order syntactically consistent theory has a model).
There is a finitistic proof that any $\Pi^0_2$ sentence of arithmetic that is provable in $\text{WKL}_0$ is provable in the finitistic system $\text{PRA}$. This kind of metatheorem is known as a "conservation result".
So a formalist can work in $\text{WKL}_0$, and use the completeness theorem for countable theories, while knowing all the while that any $\Pi^0_2$ sentence that she proves is already finitistically provable. Because statements viewed as meaningful in finitism are going to be at worst $\Pi^0_2$, this means that any meaningful statement she proves from $\text{WKL}_0$ can be proved finitistically, and hence can be "accepted".
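To make the shape of this explicit (a schematic statement; precise formulations vary between presentations): a $\Pi^0_2$ sentence is one of the form $\forall x\,\exists y\,\theta(x,y)$ with $\theta$ quantifier-free, and the conservation result says that

$$\text{if } \text{WKL}_0 \vdash \forall x\,\exists y\,\theta(x,y), \text{ then } \text{PRA} \vdash \theta(x, f(x)) \text{ for some primitive recursive } f,$$

where the $\text{PRA}$ version is stated with free variables and an explicit primitive recursive witness, since $\text{PRA}$ itself is a quantifier-free system.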
This method is sometimes known as 'reductionism'. The idea is that someone can use an "ideal" theory (e.g. $\text{WKL}_0$) to prove a wide variety of statements, knowing that any statement in some designated subclass (e.g. the $\Pi^0_2$ sentences) proved in this way is already provable in a weaker "concrete" system that the person prefers.