Suppose you are given the premisses
$$(P1)\quad\quad p\quad$$
and
$$(P2)\quad p \to q.$$
(It doesn't matter for what follows whether we are working here in mathematician's English, augmented with an arrow to abbreviate 'if ..., then ...', or in a more fully formalized language.)
To derive the obvious conclusion from these two premisses we need a principle of inference, a permissive rule that says
From $A$ and $A \to B$, you can infer $B$.
Without accepting some such rule, we can't get anywhere. If we do accept the rule, then from our two premisses we can infer the conclusion
$$(C)\quad q.\quad$$
That displayed inference rule is of course the Modus Ponens rule.
And as was pointed out long ago, most famously by Lewis Carroll in 1895 (in his article 'What the Tortoise Said to Achilles'), you can't replace the rule by a sentence or proposition such as
$$(P3) \quad (p \wedge (p \to q)) \to q$$
to serve as a third premiss. For if we just accept this as a new premiss, we will still need a permissive rule to allow us to get anywhere, e.g. the rule
From $A$ and $A \to B$ and $(A \wedge (A \to B)) \to B$, you can infer $B$.
Can we avoid appeal to that new rule by instead accepting the proposition
$$(P4)\quad[p \wedge (p \to q) \wedge ((p \wedge (p \to q)) \to q)] \to q$$
as a new premiss? Of course not. To get to $q$ we'd need to invoke yet another rule! So we really, really don't want to start down this regress!
For more discussion, see e.g. http://en.wikipedia.org/wiki/What_the_Tortoise_Said_to_Achilles
Moral: we can't replace the modus ponens rule by a sentence such as $(P3)$. Of course, $(P3)$ is true, and the rule and the truth are intimately connected. But even when both are available, 'modus ponens' is -- by very long tradition -- the name for the rule, not for the related true sentence.
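The distinction between the rule and the related true sentence can be made concrete in a proof assistant; here is a minimal sketch in Lean 4 (the variable names are just for illustration). The first `example` applies modus ponens as a rule (function application); the second proves $(P3)$ as a sentence -- but notice that even then, *using* that sentence requires applying a rule:

```lean
-- Modus ponens as a *rule*: given proofs of p and p → q, produce a proof of q.
example (p q : Prop) (hp : p) (hpq : p → q) : q := hpq hp

-- (P3) as a *sentence*: it is provable, but to get q from it
-- we would still have to apply a rule to discharge the antecedent.
example (p q : Prop) : (p ∧ (p → q)) → q :=
  fun h => h.2 h.1
```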
Best Answer
First of all, let me set the terminology straight:
By a syntactic proof ($\vdash$) we mean a proof purely operating on a set of rules that manipulate strings of symbols, without reference to semantic notions such as assignment, truth, model or interpretation. A syntactic proof system is one that says, for example, "If you have $A$ written on one line and $B$ on another line, then you are allowed to write the symbols $A \land B$ on a line below that." Examples of syntactic proof systems are Hilbert-style calculi, sequent calculi and natural deduction in their various flavors, and Beth tableaux (a.k.a. truth trees).
By a semantic proof ($\vDash$) we mean a proof operating on the semantic notions of the language, such as assignment, truth, model or interpretation. Examples of semantic proofs are truth tables, presentation of (counter) models or arguments in text (along the lines of "Suppose $A \to B$ is true. Then there is an assignment such that...").
Furthermore, the term "entailment" is usually understood as a purely semantic notion ($\vDash$), while the syntactic counterpart ($\vdash$) is normally referred to as derivability.
(The division "$\vDash$ = semantics/models and $\vdash$ = syntax/proofs" is oversimplifying matters a bit -- proof theoretic semantics, for example, argues that a semantics can be established in terms of formal (= "syntactic") proofs rather than just by model-theoretic considerations, but for the sake of this explanation, let's keep this more simple two-fold distinction up.)
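The "purely syntactic" character of such a rule can be made concrete with a toy sketch in Python (the tuple-based formula encoding is invented for illustration): the rule builds a new string of symbols from two given ones and never consults truth values at all.

```python
# Conjunction introduction as a purely syntactic rule: it operates on
# formula representations alone, with no reference to truth or models.
# Formulas are encoded as strings (atoms) or nested tuples (an assumption
# made just for this sketch).

def and_intro(line_a, line_b):
    """From derived lines A and B, derive the line ("and", A, B)."""
    return ("and", line_a, line_b)

proof = ["p", "q"]                        # two already-derived lines
proof.append(and_intro(proof[0], proof[1]))
print(proof[-1])  # ('and', 'p', 'q')
```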
I'm clarifying this because the way you set out things is not entirely accurate:
In a way yes, the syntax of a logic is always relevant when talking about notions such as entailment or derivability -- but what is the crucial feature that makes us call this notion syntactic? It isn't that the syntax of the language is involved in establishing entailment or derivability relations. The crucial feature is that the set of rules we use is purely syntactic, i.e. merely operating on strings of symbols, without making explicit reference to meaning.
That's not quite accurate -- in order to establish the truth value of a formula, and hence notions such as validity or entailment, we have to investigate the syntax of a formula in order to determine any truth value at all. After all, truth is defined inductively on the structure (= the syntax) of formulas: "$[[A \land B]]_v = \text{true}$ iff $[[A]]_v = \text{true}$ and $[[B]]_v = \text{true}$", and so on. If we didn't care about syntax, then we couldn't talk about semantics at all.
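This inductive definition of truth can be written out directly; a minimal sketch in Python, assuming the same invented tuple encoding of formulas (atoms as strings, connectives as tagged tuples):

```python
# Truth defined inductively on the *syntax* of a formula: each clause of the
# function mirrors one clause of the semantic definition [[.]]_v.

def truth(formula, v):
    """Evaluate a formula under the assignment v by recursion on structure."""
    if isinstance(formula, str):          # atomic proposition: look up v
        return v[formula]
    op = formula[0]
    if op == "not":
        return not truth(formula[1], v)
    if op == "and":
        return truth(formula[1], v) and truth(formula[2], v)
    if op == "or":
        return truth(formula[1], v) or truth(formula[2], v)
    if op == "implies":
        return (not truth(formula[1], v)) or truth(formula[2], v)
    raise ValueError(f"unknown connective: {op}")

# [[p ∧ (p → q)]]_v with v(p) = v(q) = true
print(truth(("and", "p", ("implies", "p", "q")), {"p": True, "q": True}))  # True
```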
Now to your actual question:
The short answer is: Because syntactic proofs are often a lot easier.
For propositional logic, the world is still relatively innocent: We can just write down a truth table, compute the truth value of each formula, and check whether every line in which all of the premise columns read "true" also has "true" in the conclusion column. As you point out, this procedure quickly explodes for formulas with many propositional variables, but it is still a deterministic procedure that's doable in finite time.
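The truth-table decision procedure can be sketched in a few lines of Python. Here formulas are represented as functions from an assignment to a Boolean -- a toy encoding chosen for brevity, not a parser:

```python
from itertools import product

# Semantic entailment by brute force: premises entail the conclusion iff
# every assignment making all premises true also makes the conclusion true.

def entails(premises, conclusion, variables):
    """Check entailment by enumerating all 2^n truth assignments."""
    for values in product([True, False], repeat=len(variables)):
        v = dict(zip(variables, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False  # found a countermodel
    return True

# p, p -> q |= q  (modus ponens, checked semantically)
p = lambda v: v["p"]
p_implies_q = lambda v: (not v["p"]) or v["q"]
q = lambda v: v["q"]
print(entails([p, p_implies_q], q, ["p", "q"]))  # True
```

Note how the loop makes the combinatorial explosion explicit: the number of assignments doubles with every additional propositional variable.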
We could also present a natural language proof arguing in terms of truth assignments. This can be a bit more cumbersome, but might be more instructive, and is still relatively handleable for the relatively simple language and interpretation of propositional logic.
But things get worse when we go to first-order logic. Here we are confronted with formulas that are interpreted in structures whose domains are potentially infinite. Even worse, in contrast to propositional logic, where the number of assignments (~= interpretations) is always finite and completely determined by the number of propositional variables, the structures (~= interpretations) in which a first-order formula might or might not be true are unlimited in size and shape. That is, not only can the structures themselves be infinite, but we are now facing infinitely many structures in which our formulas can be interpreted in the first place. Simply writing down a truth table will no longer work for the language of predicate logic, so determining the truth value -- and hence semantic properties and relations such as validity and entailment -- is no longer a simple deterministic procedure for predicate logic.
So if we want to present a semantic proof, we have to revert to arguments about the structures that a formula does or does not satisfy. This is where an interesting duality enters:
To prove that $\phi$ is satisfiable or that $\phi$ is not valid,
it suffices to provide one (counter)model and we're done: If we find just one structure in which $\phi$ is true then we know that $\phi$ is satisfiable, and conversely, if we find one structure in which $\phi$ is not true then we know that $\phi$ is not valid.
Analogously, to prove that $\exists x \phi(x)$ is true or that $\forall x \phi(x)$ is false in a given structure,
it suffices to find one element in the structure's domain that provides an example for the existentially quantified formula or, respectively, a counterexample to the universal quantification and we're done.
However,
to prove that $\phi$ is valid or that $\phi$ is unsatisfiable,
we are suddenly faced with the difficult task of making a claim about all possible structures. We cannot simply list them, as there are infinitely many of them, so all we can do is write a natural-language text arguing over the possible truth values of formulas, eventually showing that all structures must succeed or fail to fulfill a certain requirement.
Analogously, to prove that $\forall x \phi(x)$ is true or that $\exists x \phi(x)$ is false in a given structure,
we have to iterate over all the elements in the structure's domain. If the domain is finite, we are lucky and can simply go through all of the possible values (although this may take quite some time if the domain is large enough), but if it is infinite, there is no way we can ever get done if we just brute force check the formula for the elements one after the other.
This is a rather unpleasant situation, and exactly the point where syntactic proofs come in very handy.
Recall the definition of entailment: $\Gamma \vDash \phi$ iff every structure that makes all formulas in $\Gamma$ true also makes $\phi$ true -- or equivalently, iff there is no structure in which all formulas in $\Gamma$ are true but $\phi$ is false.
This is precisely the kind of universal quantification that makes purely semantic proofs difficult: We would have to establish a proof over the infinity of all the possible structures to check whether the semantic entailment relation does or does not hold.
But now look at the definition of syntactic derivability: $\Gamma \vdash \phi$ iff there exists a derivation whose open assumptions are all contained in $\Gamma$ and whose conclusion is $\phi$.
The nasty universal quantifier suddenly became an existential one! Rather than having to argue over all the possible structures, it now suffices to show just one syntactic derivation and we're done. (The same applies to the case where we don't have any premises: $\vDash \phi$ ("$\phi$ is valid" = "$\phi$ is true in all structures") vs. $\vdash \phi$ ("$\phi$ is derivable" = "there is a derivation with no open assumptions and $\phi$ as the conclusion").) This is an enormous advantage -- call it "superior" if you want.
Now we have a kind of disparity: For some things semantics is hard whereas syntax is easy, so how can we use this disparity for good?
Luckily, in the case of classical logic, we are equipped with soundness (if $\Gamma \vdash \phi$ then $\Gamma \vDash \phi$) and completeness (if $\Gamma \vDash \phi$ then $\Gamma \vdash \phi$):
While any reasonable derivation system will be sound w.r.t. the language's semantics, completeness is a non-trivial and very powerful result: If we want to prove a semantic entailment, by completeness we know that there must be a syntactic derivation, so we can go and find just one such derivation, and as soon as we do, soundness assures us that this is indeed a proof that the entailment holds semantically. Hence, we can use syntactic proofs to avoid cumbersome semantic arguments that involve meta-logical quantification over all structures. This is pretty neat.
Now note how things turn around for the syntactic calculus: to prove that a formula is not derivable ($\Gamma \nvdash \phi$), we would have to argue over all possible syntactic proofs, which can again be difficult.
Now we can apply the soundness and completeness results in the other direction. If we want to show that a formula is not derivable, then by contraposition on completeness it suffices to show that it is not valid (for if it were valid, completeness would give us a derivation). So we can carry out a semantic proof by providing just one counter-model to the validity of $\phi$, and then, by contraposition on soundness, we can be sure that no derivation exists (if there were a derivation of something that is not semantically valid, our system would be unsound). Thus we obtain a proof of underivability without needing to argue over hypothetical derivations that can't exist.
And this is precisely how the aforementioned duality comes about:
Thanks to soundness and completeness, the duality of semantic and syntactic proofs can help us bridge the difficult parts: to establish an entailment, we exhibit just one syntactic derivation rather than arguing over all structures; to establish underivability, we exhibit just one counter-model rather than arguing over all derivations.
I think the existence of syntactic calculi is already wonderful enough simply for the mathematical beauty of this symmetry.