There are many flavors of "meta" in logic. Most make very minimal use of the metatheory. For example, the Montague Reflection Principle in Set Theory says the following:
Metatheorem. For every formula $\phi(x)$ of the language of Set Theory, the following is provable: If $\phi(a)$ is true then there is an ordinal $\alpha$ such that $a \in V_\alpha$ and $V_\alpha \vDash \phi(a)$.
This is in fact an infinite collection of theorems, one for each formula $\phi(x)$. Gödel's theorem prevents ZFC from proving all of these instances at the same time. The proof of this metatheorem is by induction on the length of $\phi(x)$. The requirements on the metatheory are very minimal, all you need is enough combinatorics to talk about formulas, proofs, and enough induction to make sense of it all. This is by far the most common flavor of "meta" in logic. There really is not much to it. Most logicians don't worry about the metatheory in this context. Indeed, it is often irrelevant and, like most mathematicians, logicians usually believe in the natural numbers with full induction.
However, sometimes there is need for a much stronger metatheory. For example, the metatheory of Model Theory is usually taken to be ZFC. Gödel's Completeness Theorem is a nice contrasting example.
Completeness Theorem. Every consistent theory is satisfiable.
Recall that a theory is consistent if one cannot derive a contradiction form it, while a theory is satisfiable if it has a model. The consistency of a theory is a purely syntactic notion. When the theory is effective, consistency can be expressed in simple arithmetical terms in a manner similar to the one discussed above. However, satisfiability is not at all of this form since the models of a theory are typically infinite structures. This is why this theorem is usually stated in ZFC. For the Completeness Theorem, one can often get by with substantially less. For example, the system WKL0 of second-order arithmetic suffices to prove the Completeness Theorem for countable theories (that exist in the metaworld in question), but such a weak metatheory would be nothing more than a major inconvenience for model theorists.
Sometimes there are multiple choices of metatheory and no consensus on which is most appropriate. This happens in the case of Forcing in Set Theory, which has three common points of view:
Forcing is a way to extend a countable transitive model of ZFC' to a larger model of ZFC' with different properties. Here ZFC' is a finite approximation to ZFC, which makes this statement non-vacuous in ZFC because of the Reflection Principle above.
Forcing is a way to talk about truth in an alternate universe of sets. Here, the alternate universe is often taken to be a Boolean-valued model, formalized within the original universe.
Forcing is an effective way to transform a contradiction some type of extensions of ZFC into contradictions within ZFC itself. Here, ZFC could be replaced by an extension of ZFC.
The last point of view, which is the least common, is essentially arithmetical since it only talks about proofs. The second point of view is well suited for hard-core platonists who believe in one true universe of sets. The first point of view, which is probably most popular, essentially takes ZFC as a metatheory much like model theorists do.
Since there is no consensus, set theorists will often talk as if both $V$ and its forcing extension $V[G]$ are absolutely real universes of sets. Although this point of view is hard to justify formally, it is possible to makes sense of it using any one of the three viewpoints above and it has the advantage that it makes it easier to express ideas that go into forcing constructions, which is what set theorists really want to talk about.
Finally, the idea of analyzing which systems prove the consistency of other systems is very common in logic. In Set Theory, these systems often take the form of large cardinal axioms. In Second-Order Arithmetic these systems often take the form of comprehension principles. In First-Order Arithmetic these systems often take the form of induction principles. Together these form an incredibly long consistency strength hierarchy stretching from extremely weak basic arithmetical facts to incredibly deep large cardinal axioms. A very significant part of logic deals with studying this hierarchy and, as Andrej Bauer commented, logicians are usually very aware of where they are sitting in this hierarchy when proving metatheorems.
Apart from the storm of comments, let me just try to answer the question.
There are several ways in a which a mathematical theorem
can be contingent.
First, the independence phenomenon in set theory shows the striking ubiquity of
contingency in mathematics. For example, the Continuum Hypothesis is true
is some set-theoretic
worlds and false in others (and we can control it
exactly). There are hundreds of other examples of
statements with the same independence status---they are
true in some worlds and false in others. The method of
forcing has been used to spectacular effect in proving
many of these independence results.
The Incompleteness phenomenon of Goedel can be used to show that
whether a statement is provable or not from a given
axiom system (in classical logic) can be contingent.
Specifically, the Incompleteness theorem says that no
theory T, if consistent, can prove its own consistency.
Thus, if ZFC is consistent, then there are models of ZFC
in which ZFC is thought to be inconsistent. In such a
model, ZFC is thought to prove any statement at all! But
in our world, not all these statements will be theorems.
Thus, in this sense, even the question of whether a given statement is a theorem or not
can be contingent.
The large cardinal hierarchy in set theory provides
numerous examples of statements transcending the
consistency strength of weaker statements. If large cardinals are consistent, then there are some set-theoretical universes in which ZFC proves that there are no inaccessible cardinals and other universes where it does not.
There are also several ways in which contingency is ruled
out.
First, one of the most important properties of a proof
system is soundness,
which means that any statement provable in the system from
true hypotheses will remain true. Of course, this is an
expected feature of any proof system worthy of the name. A
theorem is a statement having a proof in such a system. Once we have adopted a
given proof system that is sound, and the axioms are all
necessarily true, then the theorems will also all be necessarily
true. In this sense, there can be no contingent theorems.
Second, one of the profound achievements of Goedel was
his Completeness theorem, which states that any
statement that holds in all models of a given first order
theory T, actually has a proof from the theory. For
example, every statement in the language of group theory
that happens to be true in all groups, actually has a
finite proof from the group axioms (using any of several
proof systems). This is far from obvious, and I find it profound.
But it answers a dual version of a question you might
have asked, which I find interesting, namely:
Is every necessary truth a theorem?
The answer is Yes, and this is just what the Completeness
theorem expresses.
These last two points together explain that if one takes the possible worlds to be all models of a given theory, then the necessary truths are precisely the theorems of that theory.
Best Answer
It seems to me that the mathematical equivalent of the intensional vs. extensional distinction in philosophy would be the distinction between "formal" vs. "functional" objects: formal power series vs. convergent power series, formal integration by parts (with no regard for checking the validity of the operation in a real analysis sense) vs. rigorous integration by parts, formal polynomials vs. functions which happen to be represented by a polynomial, etc. If so, I would say that the formal vs. functional distinction is usually dealt with in more advanced classes, though usually not at the first-year undergraduate level.
For instance, in algebra, the concept of an indeterminate variable (and its distinction from the set-theoretic notion of a variable in a fixed domain) tends to be sufficient for keeping the two concepts distinct in most situations involving set-theoretic functions and the formal expressions giving rise to those functions. In particular, polynomials can be formal by living in some polynomial ring $R[x]$ generated by an indeterminate $x$, rather than having to be set-theoretic functions on some domain. Algebraic geometry also takes particular care in distinguishing an ideal of polynomials from the set-theoretic locus that that ideal cuts out over a given field, or more generally by distinguishing a scheme from a variety.
Similarly, real analysis, with all its cautionary counterexamples as to how various formal operations (e.g. exchanging limits or sums) can lead to disaster if the appropriate functional hypotheses are not verified, also tends to be pretty good about distinguishing a formal computation from a functional one; often the former is used as an initial heuristic motivation only, with the latter then being brought in for the rigorous proof. Although certainly mistakes have been made by treating a formal computation as if it were functionally valid...
Related to this is the ubiquitous "abuse of notation" in which a package of objects, structures, and forms is referred to via its most prominent component (i.e. by synecdoche). Thus, for instance, one often sees a polynomial function $P: {\bf R} \to {\bf R}$ being used to simultaneously represent both the polynomial function and the formal polynomial that represents it, or vice versa (e.g. "the polynomial $x^2$" to refer to the function $x \mapsto x^2$). Another common instance of this is when dealing with spaces (sets with additional structure); one often abuses notation by using the set itself to denote the space, e.g. a group might be denoted by its set $G$ of elements, rather than by the tuple $(G, e, \cdot, ()^{-1})$ of group structures, or a set-theoretic function by just the mapping $f$, rather than than the triplet $(f,X,Y)$ that includes the domain and codomain of that mapping. Such abuses are technically illegal using the strictest interpretations of mathematical notation, but they save a lot of space and, when used correctly, allow readers to focus on the actual content of an argument rather than on its formalism. Still, it is useful and important to point these abuses out explicitly from time to time...