I think an important answer is still missing, so I am going to type it. This is fairly standard knowledge in the field of foundations, but it is not always adequately described in lower-level texts.
When we formalize the syntax of formal systems, we often talk about the set of formulas. But this is just a way of speaking; there is no ontological commitment to "sets" as in ZFC. What is really going on is an "inductive definition". To understand this you have to temporarily forget about ZFC and just think about strings that are written on paper.
The inductive definition of a "propositional formula" might say that the set of formulas is the smallest class of strings such that:
Every variable letter is a formula (presumably we have already defined a set of variable letters).
If $A$ is a formula, so is $\lnot (A)$. Note: this string has three more symbols than $A$.
If $A$ and $B$ are formulas, so is $(A \land B)$. Note: this string has three more symbols than $A$ and $B$ combined.
This definition can certainly be read as a definition in ZFC. But it can also be read in a different way. The definition can be used to generate a completely effective procedure that a human can carry out to tell whether an arbitrary string is a formula (a proof along these lines, which constructs a parsing procedure and proves its validity, is in Enderton's logic textbook).
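To make this concrete, here is a minimal sketch in Python of such a recognition procedure. The function names and the ASCII stand-ins `~` for $\lnot$ and `&` for $\land$ are my own choices, and variable letters are assumed to be single lowercase letters; this is not Enderton's construction, just an illustration. The grammar above is unambiguous, so a simple recursive descent works:

```python
def is_formula(s: str) -> bool:
    """Decide whether s is a formula of the grammar:
    formula ::= variable | "~(" formula ")" | "(" formula "&" formula ")"."""
    ok, rest = parse(s)
    return ok and rest == ""

def parse(s: str):
    """Try to read one formula from the front of s.
    Return (True, remainder) on success, (False, s) on failure."""
    if s and s[0].islower():              # a variable letter
        return True, s[1:]
    if s.startswith("~("):                # ~(A): negation adds 3 symbols
        ok, rest = parse(s[2:])
        if ok and rest.startswith(")"):
            return True, rest[1:]
        return False, s
    if s.startswith("("):                 # (A&B): conjunction adds 3 symbols
        ok, rest = parse(s[1:])
        if ok and rest.startswith("&"):
            ok2, rest2 = parse(rest[1:])
            if ok2 and rest2.startswith(")"):
                return True, rest2[1:]
        return False, s
    return False, s

assert is_formula("(p&~(q))") and not is_formula("p&q")
```

Carrying out this procedure requires no knowledge of sets, only the ability to scan and compare finite strings.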
In this way, we can understand inductive definitions in a completely effective way without any recourse to set theory. When someone says "Let $A$ be a formula" they mean to consider the situation in which I have in front of me a string written on a piece of paper, which my parsing algorithm says is a correct formula. I can perform that algorithm without any knowledge of "sets" or ZFC.
Another important example is "formal proofs". Again, I can treat these simply as strings to be manipulated, and I have a parsing algorithm that can tell whether a given string is a formal proof. The various syntactic metatheorems of first-order logic are also effective. For example, the deduction theorem gives a direct algorithm to convert one sort of proof into another sort of proof, as sketched below. The algorithmic nature of these metatheorems is not always emphasized in lower-level texts, but it is very important in contexts such as automated theorem proving.
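To illustrate that algorithmic character, here is a hedged sketch of the deduction-theorem transformation for a Hilbert-style calculus whose axiom schemes include $A \to (B \to A)$ (axiom 1) and $(A \to (B \to C)) \to ((A \to B) \to (A \to C))$ (axiom 2), with modus ponens as the only rule. The encoding and the function name `deduction` are my own; the transformation itself is the standard textbook one, rewriting a proof of $B$ from $\Gamma \cup \{A\}$ line by line into a proof of $A \to B$ from $\Gamma$:

```python
def imp(a, b):
    """Formulas as nested tuples: a variable is a string;
    an implication A -> B is ("->", A, B)."""
    return ("->", a, b)

def deduction(proof, A):
    """Input: a Hilbert-style proof as a list of pairs (formula, justification),
    where justification is "hyp" (a hypothesis, possibly A itself), "axiom",
    or ("mp", i, j) with line i proving Ci and line j proving Ci -> C.
    Output: the formulas of a proof of A -> (last line) that no longer
    uses A as a hypothesis."""
    out = []
    for C, just in proof:
        if C == A:
            # the standard five-line derivation of A -> A
            out += [imp(A, imp(imp(A, A), A)),                    # axiom 1
                    imp(imp(A, imp(imp(A, A), A)),
                        imp(imp(A, imp(A, A)), imp(A, A))),       # axiom 2
                    imp(imp(A, imp(A, A)), imp(A, A)),            # MP
                    imp(A, imp(A, A)),                            # axiom 1
                    imp(A, A)]                                    # MP
        elif just in ("hyp", "axiom"):
            # C is still available; weaken it to A -> C via axiom 1
            out += [C, imp(C, imp(A, C)), imp(A, C)]
        else:
            i = just[1]
            Ci = proof[i][0]
            # from A -> Ci and A -> (Ci -> C), both derived earlier in out,
            # obtain A -> C via an instance of axiom 2
            out += [imp(imp(A, imp(Ci, C)),
                        imp(imp(A, Ci), imp(A, C))),              # axiom 2
                    imp(imp(A, Ci), imp(A, C)),                   # MP
                    imp(A, C)]                                    # MP
    return out
```

Each input line becomes at most five output lines, so the procedure is completely effective, and checking that the output is again a correct proof is itself a string-manipulation task.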
So if you examine a logic textbook, you will see that all the syntactic aspects of basic first-order logic are given by inductive definitions, and the algorithms given to manipulate them are completely effective. Authors usually do not dwell on this, both because it is completely standard and because they do not want to overwhelm the reader at first. So the convention is to write definitions "as if" they are definitions in set theory, and to allow readers who know what's going on to read them as formal inductive definitions instead. When read as inductive definitions, these definitions make sense even to the fringe of mathematicians who do not think any infinite sets exist but who are willing to study algorithms that manipulate individual finite strings.
Here are two more examples of the syntactic algorithms implicit in certain theorems:
Gödel's second incompleteness theorem actually gives an effective algorithm that can convert any PA-proof of Con(PA) into a PA-proof of $0=1$. So, under the assumption that there is no proof of the latter kind, there is no proof of the former kind.
The method of forcing in ZFC actually gives an effective algorithm that can turn any proof of $0=1$ from the assumptions of ZFC and the continuum hypothesis into a proof of $0=1$ from ZFC alone. Again, this gives a relative consistency result.
Results like the previous two examples are often called "finitary relative consistency proofs". Here "finitary" should be read to mean "providing an effective algorithm to manipulate strings of symbols".
This viewpoint helps explain where weak theories of arithmetic such as PRA enter into the study of foundations. Suppose we ask: "what axioms are required to prove that the algorithms we have constructed do what they are supposed to do?". It turns out that very weak theories of arithmetic suffice. PRA is a particular theory of arithmetic that is, on the one hand, very weak (from the point of view of stronger theories like PA or ZFC), but is at the same time able to prove that (formalized versions of) the syntactic algorithms work correctly; it is often used for exactly this purpose.
About intuition.
Aristotle introduced a notion of consequence along the following lines. Suppose $A$ and $B$ are sentences, each of them either true or false. We study the relation:
(1) 'If A is true, then B has to be true too'.
Many logicians reckoned that the main use of logic is to derive conclusions from premises, with the obvious goal that if we start from true premises and use “logical rules”, we will derive true conclusions. This idea of logic gives rise to the relation:
(2) 'From A, we can prove B'.
Now we try to translate all this into modern terms.
(1) translates into the relation of logical consequence $A \vDash B$:
'For every interpretation $\mathcal I$, if A is true under $\mathcal I$, then B is true under $\mathcal I$.'
We “generalize” it by putting a set $\Gamma$ of sentences in place of the sentence $A$:
$\Gamma \vDash \varphi$
which means that:
'For every interpretation $\mathcal I$, if all of $\Gamma$ are true under $\mathcal I$, then $\varphi$ is true under $\mathcal I$.'
Lastly, we have the “special case” when $\Gamma = \emptyset$: $\vDash \varphi$ means that the sentence $\varphi$ is valid (in propositional logic, a tautology), i.e. it is true under every interpretation.
In this case, we also call it a logically (or universally) valid sentence, meaning that it is true “under all possible circumstances”.
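Because $\Gamma \vDash \varphi$ quantifies over all interpretations, in the propositional case it can be decided by brute force over truth-value assignments. Here is a minimal sketch in Python (the tuple encoding and the function names `eval_formula`, `variables`, and `entails` are my own illustration, not standard notation):

```python
from itertools import product

def eval_formula(f, v):
    """Evaluate a formula under a valuation v (dict: variable -> bool).
    A variable is a string; ("not", A) and ("->", A, B) are the connectives."""
    if isinstance(f, str):
        return v[f]
    if f[0] == "not":
        return not eval_formula(f[1], v)
    return (not eval_formula(f[1], v)) or eval_formula(f[2], v)  # ("->", A, B)

def variables(f):
    """Collect the variable letters occurring in f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(variables(g) for g in f[1:]))

def entails(gamma, phi):
    """Gamma |= phi: phi is true under every valuation making all of Gamma true."""
    vs = sorted(set().union(variables(phi), *(variables(g) for g in gamma)))
    for bits in product([True, False], repeat=len(vs)):
        v = dict(zip(vs, bits))
        if all(eval_formula(g, v) for g in gamma) and not eval_formula(phi, v):
            return False
    return True

# modus ponens is a semantically valid argument; affirming the consequent is not
assert entails([("->", "p", "q"), "p"], "q")
assert not entails([("->", "p", "q"), "q"], "p")
```

(For first-order logic there is no such brute-force check over all interpretations, which is one reason the derivability relation discussed next is important.)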
For (2) [see Neil Tennant, Natural Logic (1978), page 5], we have that, in order to demonstrate the validity of an argument, one needs a proof.
A proof of an argument is a sequence of steps leading from its premisses, taken as starting points, to its conclusion as end. Within a proof, each step, or 'inference', must be obviously valid.
A system of proof is a codification of these obviously valid kinds of inference, which we call rules of inference. A proof in accordance with these rules must, in order to meet the demands of certainty, satisfy the following conditions:
(i) It must be of finite size.
(ii) Every one of its steps must be effectively checkable.
A system of this type is a logical calculus (i.e. a proof system) formalizing the relation of derivability:
$A \vdash B$
which means that:
'B is derivable using only the logical rules of the calculus, starting from A as premise'.
We “generalize” it by putting a set $\Gamma$ of sentences in place of the sentence $A$:
$\Gamma \vdash \varphi$.
Lastly, we have the “special case” when $\Gamma = \emptyset$: $\vdash \varphi$ means that the sentence $\varphi$ is a theorem of the calculus (it is provable), i.e. it is derivable from no premises.
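For instance, with axiom schemes $A \to (B \to A)$ (axiom 1) and $(A \to (B \to C)) \to ((A \to B) \to (A \to C))$ (axiom 2) and modus ponens as the only rule (a standard Hilbert-style calculus; other calculi differ in detail), $\vdash p \to p$ is witnessed by the following derivation, which is finite and effectively checkable, as conditions (i) and (ii) require:

$$\begin{array}{lll}
1. & (p \to ((p \to p) \to p)) \to ((p \to (p \to p)) \to (p \to p)) & \text{axiom 2}\\
2. & p \to ((p \to p) \to p) & \text{axiom 1}\\
3. & (p \to (p \to p)) \to (p \to p) & \text{MP, 1, 2}\\
4. & p \to (p \to p) & \text{axiom 1}\\
5. & p \to p & \text{MP, 3, 4}
\end{array}$$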
Obviously there must be no invalid proofs in our system: the system must be sound.
Moreover we wish to be able to prove any valid argument expressible in the language. The system, that is, must be complete (or adequate).
Soundness: if $\Gamma \vdash \varphi$, then $\Gamma \vDash \varphi$ (with the particular case: if $\vdash \varphi$, then $\vDash \varphi$).
Completeness: if $\Gamma \vDash \varphi$, then $\Gamma \vdash \varphi$ (with the particular case: if $\vDash \varphi$, then $\vdash \varphi$).
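As a small mechanical illustration of the soundness direction, the `entails` checker sketched earlier can confirm the two facts the usual induction rests on: every instance of the axiom schemes is valid, and modus ponens preserves truth under every interpretation (again only a sketch, reusing my hypothetical encoding):

```python
# base case: instances of the two axiom schemes are tautologies
A1 = ("->", "p", ("->", "q", "p"))
A2 = ("->", ("->", "p", ("->", "q", "r")),
            ("->", ("->", "p", "q"), ("->", "p", "r")))
assert entails([], A1) and entails([], A2)

# inductive step: whatever makes p and p -> q true also makes q true
assert entails(["p", ("->", "p", "q")], "q")
```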
Best Answer
Not exactly. Consider the following example: an English book about Greek grammar is about a language, Greek, and it is written in another language, English, the metalanguage. But it is neither logic nor about logic.
But, yes; see S. C. Kleene, Introduction to Metamathematics (1952), page 62: metamathematics is the mathematical discipline that studies the properties of those specific mathematical objects that are the formalized theories.
When the formalized theory is a logical calculus, such as the propositional or predicate calculus, the metamathematical discipline studying it can quite naturally be called metalogic.
This is consistent with what G. Hunter asserts in the Preface (p. xi).
Considering specific examples:
(1) $p \to q$ is a formula of propositional calculus, i.e. an expression of the object language, written using the symbols of the language, like $\to$.
$\vDash (p \to p)$ is a statement in the metalanguage, asserting that the formula $(p \to p)$ is valid.
The symbol $\vDash$ is not part of the calculus, i.e. it is not part of the object language: it is a metalinguistic expression.
Regarding (2): exactly; a mathematical logic textbook is "made of" very few parts written in the object language: basically, only the initial examples of formulas and derivations, aimed at showing "how the calculus works".
The main part of the book is "metalogic": the description of the syntax of the formal system is metalogical, and so are the proofs of theorems like the Deduction Theorem and the Soundness and Completeness Theorems, i.e. the mathematical theorems expressing the properties of the formal system.
In conclusion, regarding (3): yes, a mathematical logic book is a metalogic book.
Compare with a textbook named "Physics": it is a book stating the physical theory, i.e. the theory regarding the facts of the physical realm.