Just like most mathematical theorems, you can formalize Godel's Theorems in some first order language (with some "standard" interpretation under which the formalization means what it's supposed to mean), turn the proof into a purely syntactic string of formulas, and figure out which formulas in that first order language are needed as axioms. I'm embarrassed to say I don't know exactly how strong the assumptions we need are to carry out the proof of Godel's Theorems, but there will be some weak fragment of ZFC probably not much stronger than PA which will do. So we would be using a theory slightly stronger than PA to establish the incompleteness of PA, but why should that be a problem?
The axioms needed for the proofs of Godel's Theorems are probably pretty natural, probably pretty close to PA, and probably have a natural interpretation. If you believe these axioms have this interpretation, then you would have no problem with Godel's proofs or the interpretation of the theorems. If not, then you're probably pretty close to believing PA is inconsistent, in which case you would probably:
- Accept that the formalized versions of Godel's Theorems follow from whatever axioms are used, but only because you believe those axioms are inconsistent.
- Deny that the formalized versions of Godel's Theorems mean what they're supposed to mean, and just regard what's happening in point 1 as a valid string of symbolic manipulations.
- Accept the natural language meaning of Godel's Theorems, in spite of point 2, for trivial reasons, since they say, "if PA is consistent, ..."
EDIT: (in response to your edit) So we're assuming PA proves FTA, PA is consistent, and FTA might be false? What do you mean by "false," you mean false in the standard intepretation? In that case, PA would be false in the standard interpretation. Now if we take Godel's first theorem to say, "If PA is consistent, then there is a true formula in the standard interpretation which is not provable from PA," then this meta-theorem is certainly true.
EDIT: Ignas requested an explanation of some of the basics to make sense of my claim, "If PA proves FTA, and FTA fails in the standard model, then so does PA." It's too big to fit in a comment so I'm adding it to my response:
Let $\mathcal{L}$ denote the first order language of number theory, we'll have lower case Greek letters vary over sentences of $\mathcal{L}$, upper case Greek letters vary over sets of sentences of $\mathcal{L}$, and upper case Roman letters vary over $\mathcal{L}$ structures. We write $M \vDash \varphi$ to denote that $\varphi$ is true in the model $M$, i.e. when its symbols are interpreted according to $M$. Tarski's definition of truth for a sentence in a given model is by recursion on the complexity of the sentence. We write $M \vDash \Sigma$ if every member of $\Sigma$ is true in $M$. We write $\Sigma \vDash \varphi$ if for every $M$, $M \vDash \Sigma$ implies $M \vDash \varphi$, i.e. if every model of $\Sigma$ is also a model of $\varphi$.
For provability, we write $\Sigma \vdash \varphi$ to say that there is a (finite) proof of $\varphi$ using (finitely many) sentences from $\Sigma$ as axioms.
The Soundness Theorem states that for all $\Sigma ,\ \varphi$, if $\Sigma \vdash \varphi$ then $\Sigma \vDash \varphi$. It's this theorem, with PA in place of $\Sigma$ and FTA in place of $\varphi$, that I'm using to establish the claim you're asking about. The converse of this theorem is also true; it's Godel's Completeness Theorem. Putting these two theorems together, they say that the relations $\vdash$ and $\vDash$ are the same relation between sets of sentences and sentences. One (perhaps not immediately obvious) way to rephrase this is, "being true in every model is the same as being provable from no axioms." Contrast this Godel's Incompletness Theorem, which says that "being true in the standard model of number theory is not the same as being provable from PA."
If $\kappa$ is weakly compact and there is a wellorder of $V_{\kappa+1}$ definable over $V_{\kappa+1}$ without parameters, then second order logic is Loraxian for $V_\kappa$: the least branch through a definable tree is definable. In $L$ (or in fact any of the known canonical inner models), there is such a wellorder.
You didn't ask, but I think it is consistent for second order logic to fail to be Loraxian. After a preparatory forcing, one can add a Cohen subset $G$ of $\kappa$ without destroying its weak compactness. The forcing can be factored as adding a Suslin tree $T$ and then adding $G$ as a branch through $T$. I am not an expert here, but I think one can probably force over $V[G]$ to make $T$ definable over $V_{\kappa+1}$, arranging that $G$ remains undefinable and $\kappa$ remains weakly compact. Probably one can arrange that any subset of $V_\kappa$ definable over $V_{\kappa+1}^{V[G,H]}$ is in fact be definable over $V_{\kappa+1}^{V[T]}$ from $T$. This is an (admittedly sketchy) local version of the arguments from Cheng-Friedman-Hamkins's "Large cardinals need not be large in HOD."
Best Answer
Let $S$ be a first order definable Martin-Löf random set such as Chaitin's $\Omega$. If Peano Arithmetic, or ZFC, or any other theory with a computable set of axioms, proves infinitely many facts of the form $n\in S$ or $n\not\in S$ then it follows that $S$ is not immune or not co-immune and hence not ML-random after all.
(The set of theorems of our theory of the form $n\in S$ (or $n\not\in S$) is computably enumerable and infinite, hence has an infinite computable subset. Being immune means having no infinite computable subset.)
So only finitely many such facts can be proved. Now using an effective bijection between $\mathbb N$ and $\mathbb N\times \mathbb N$, decompose $S$ into infinitely many "columns", $S=S_0\oplus S_1\oplus\cdots$. Then one of these columns has the required property.
The theory should be strong enough to deal effectively with breaking a definable set up into columns and associating values in a column with values in the original set, but this is certainly doable in PA or ZFC.