The first point is to distinguish between internal and external properties. This is exacerbated by the fact that we're looking specifically at $\mathsf{ZFC}$, which is "doing double duty" in the most confusing way possible.
The short answer to your question, though, is: "$3$."
First, the internal side of things. This is relevant to the end of your question only.
Almost always, when we say "Property X is not first-order expressible" what we mean is "There is no first-order sentence $\varphi$ such that for every appropriate structure $\mathcal{M}$, we have $\mathcal{M}\models\varphi$ iff $\mathcal{M}$ has property X." So, for example, being a torsion group is not first-order expressible.
In particular, "Pointwise definability is not first-order expressible" is a consequence of the following perhaps-simpler result:
Every (infinite) pointwise-definable structure is elementarily equivalent to a non-pointwise-definable structure.
The above statement is made and proved within $\mathsf{ZFC}$. The "nuke" is upwards Lowenheim-Skolem:
$\mathsf{ZFC}$ proves "If $\mathcal{M}$ is a pointwise-definable structure, then $\mathcal{M}$ is countable."
- Wait, what? See the very end of this answer.
$\mathsf{ZFC}$ also proves "Every infinite structure $\mathcal{M}$ is elementarily equivalent to an infinite structure of strictly greater cardinality."
Putting these together, we have the desired result.
As a corollary, we have the following (again proved in $\mathsf{ZFC}$):
For every first-order theory $T$, either $T$ has no pointwise-definable models at all or the class of pointwise-definable models of $T$ is not an elementary class.
(We need the first clause in case the relevant class is $\emptyset$. This can indeed happen, even if $T$ is consistent: consider the theory of a pure set with two elements.)
But the bulk of the issue in your OP is about the external side of things. Here it's your third option which holds, as follows:
We state and prove each relevant fact $\mathsf{A}$ inside $\mathsf{ZFC}$.
... Except that sometimes as a matter of practice, we're sloppy - and either (the two options are equivalent) actually use a stronger system $\mathsf{ZFC+X}$ or prove $\mathsf{X}\rightarrow\mathsf{A}$ for some "unspoken but clear from context" (:P) additional principle $\mathsf{X}$. Standard candidates for $\mathsf{X}$ include the standard "generalized consistency" principles ("$\mathsf{ZFC}$ has a model/$\omega$-model/transitive model") and - far less benignly, but unfortunately with nonzero frequency - the whole slew of large cardinal axioms.
However, it seems to me that the HLR paper is actually pretty good on this point. For example, the first bulletpoint of Theorem $3$ is "If $\mathsf{ZFC}$ is consistent, then there are continuum many non-isomorphic pointwise definable models of $\mathsf{ZFC}$," which is indeed a $\mathsf{ZFC}$-theorem. (I could be missing an elision somewhere else, though.)
As a coda, note that above I mentioned that $\mathsf{ZFC}$ proves that every pointwise-definable structure is countable. It does this, moreover, by exactly the "math tea argument." So what gives?
Well, we have to unpack what "Every pointwise-definable structure is countable" means when we phrase it in $\mathsf{ZFC}$. When we say "$\mathcal{M}$ is pointwise-definable," what we mean is that there is an appropriate assignment of truth values to pairs consisting of formulas of the language and tuples of appropriate arity such that [stuff]. This blob of data exists "one level higher" than $\mathcal{M}$ itself, and in particular even the bit of this blob verifying that each element of $\mathcal{M}$ satisfies "$x=x$" is a collection of $\mathcal{M}$-many facts. As such:
Using the "all-at-once" definition of $\models$, which is totally fine for set-sized structures, we have $\mathsf{ZFC}\vdash$ "$V\not\models \forall x(x=x)$."
Hehehehe.
This is because the expression "$V\not\models\forall x(x=x)$," if we try to construe it directly as above, is shorthand for: "There is a function with domain $V\times Formulas(\{\in\})$ such that ...," and that's dead on arrival since there aren't any functions with domain anything like $V$ in the first place.
So actually $\mathsf{ZFC}$ does prove "$V$ is not pointwise definable" - as long as we formulate this blindly. But if we do that, then we have to admit that $\mathsf{ZFC}$ also proves e.g. "There is no sentence which $V$ satisfies." Which ... yeah.
Incidentally, the above should make you worry about a couple things:
Relatively benignly, is the "all-at-once" definition of $\models$ actually appropriate for set-sized structures? As a matter of fact it is, but this is not entirely trivial. Specifically, the $\mathsf{ZFC}$ axioms are strong enough to perform the recursive construction of the theory of a structure, and so prove that for each (set-sized) structure $\mathcal{M}$ there is exactly one relation between formulas and tuples from $\mathcal{M}$ satisfying the desired properties. Weaker theories don't need to be this nice: while any non-totally-stupid theory can prove that at most one "theory-like thing" exists for a given structure, if we go weak enough we lose the ability to carry out Tarski's "algorithm." (Fortunately, we do in fact have to go quite weak; see my answer here.)
More foundationally, why are we so blithe about how we formulate various mathematical claims in the language of set theory? Of course this isn't at all new, and in particular the above observation that in a precise sense $\mathsf{ZFC}$ proves "$V$ doesn't satisfy $\forall x(x=x)$" is just another example of a junk theorem. However, to my mind it's one of the more worrying ones: unlike e.g. "Is $\pi\in 42$?," it's not totally clear that "Does $V\models \forall x(x=x)$?" is something we'd never accidentally ask in day-to-day mathematics. Ultimately I'm still not worried, but I do think that this highlights the seriousness of the question "Is $X$ a faithful translation of $Y$?"
Finally, on a purely technical level: what about set theories which do allow for functions on the universe, and with respect to which therefore the "internal naive $V\models ...$"-situation is nontrivial? Well, per Tarski/Godel(/etc.) we know that things have to still be nasty. See the end of this old answer of mine for a quick analysis of the specific case of $\mathsf{NF}$.
If you are using ZFC as a foundational theory, you typically assume that all of mathematics happens inside a fixed "standard" model $V$ of ZFC. Whether you consider that to be literally true or just a convenient metaphor or something else is really a question about philosophy more than mathematics.
Regarding your second question, in the context of ZFC as a foundation (that is, given that you take for granted some fixed set-theoretic universe $V$ in the background), $\mathbf R$ has a precise meaning (well, precise up to the choice of a specific construction you choose; however, the results of these constructions are all definably (in V) isomorphic).
However, the fact that ZFC proves that $\mathbf R$ satisfies the RCF axioms is not really dependent on that $V$. What happens is that the field of real numbers is definable in ZF. More precisely, there is a formula $\rho(x)$ in the language of set theory (given by the construction!) which in the "standard model" of set theory defines exactly the field of real numbers.
Then ZF proves that there is a unique set satisfying $\rho(x)$ and given an axiom $\varphi$ of RCF, we can write a formula $\varphi'(x)$ in the language of set theory such which expresses $x\models\varphi$, and for each such axiom, it is a theorem of ZF that $\rho(x)\rightarrow \varphi'(x)$ (in fact, ZF proves much more than the first-order properties: it also proves that $\rho(x)$ implies that $x$ is complete, which defines it up to isomorphism).
Thus, $\mathbf R$ is also a notion that can be interpreted in any model of ZF (via $\rho$), and it will have the same first-order properties in each case.
Best Answer
An interpretation of one theory $T$ in another theory $S$ is essentially an appropriate tuple of formulas $\Phi$ (one formula for each symbol in the language of $S$, plus another formula for "domain") such that in any model $M$ of $T$, we have "$\Phi^T\models S$" - that is, $\Phi$ defines in $M$ a set and some relations/functions on that set which together form a model of $S$.
For example, consider the usual construction of $\mathbb{C}$ as ordered pairs of reals. This corresponds to an interpretation of $Th(\mathbb{C};+,\cdot)$ in $Th(\mathbb{R};+,\cdot)$; the formulas in question are
$\varphi_{domain}(x,y)\equiv \top$ (the domain is all ordered pairs),
$\varphi_+(x_1,x_2,y_1,y_2,z_1,z_2)\equiv (x_1+y_1=z_1)\wedge (x_2+y_2=z_2)$ (addition is componentwise), and
$\varphi_\cdot(x_1,x_2,y_1,y_2,z_1,z_2)\equiv (x_1\cdot y_1-x_2\cdot y_2=z_1)\wedge(x_1\cdot y_2+x_2\cdot y_1=z_2)$ (multiplication treats the second coordinate like $i$ - note that I'm using subtraction here, which I shouldn't really, but since it's definable in terms of $+$ this is a benign abbreviation).
(A precise definition can be found in either of Hodges' model theory books, and this blog post of Hamkins - which I also cite below for a technical result - gives a good description of the notion.)
For a more technical example, forcing gives a method for constructing interpretations: e.g. the proof that ZFC+$\neg$CH is consistent if ZFC is ultimately amounts to constructing an interpretation $\Phi$ of ZFC+$\neg$CH in ZFC (with a bit of thought taken to handle non-well-founded models).
Two theories are then inter-interpretable (or mutually interpretable) if each is interpretable in the other. Interestingly there is a sharper version, called bi-interpretability, which requires that the interpretations commute nicely: given $M\models T$, we need "($M$'s model of $S$)'s model of $T$" to be isomorphic to $M$ (and similarly with $S$ and $T$ switched).
(In fact we require even more than that even: we need such isomorphisms to be uniformly definable. But this is a bit technical so I'll ignore it for now.)
This is in general a stronger condition than mere mutual interpretability; see e.g. here to see a stark difference.