An axiom schema is an infinite set of axioms, all of which have a similar form.
For example, in Peano Arithmetic, there is the axiom schema of induction; for each first order predicate $\phi$ in the language of PA, we have an axiom:
$$(\phi(0) \wedge \forall n (\phi(n) \rightarrow \phi(n+1))) \rightarrow \forall n \phi(n)$$
For different formulas $\phi$, we get completely separate axioms. For example, we can replace $\phi$ with the formula $n+1+1 = n+2$ or with the formula $\forall m (((\exists l) n + l = m) \vee ((\exists l) n = m + l))$ or whatever other formula we can write down.
However, each of these axioms has the same "shape" - they're given by substituting some formula in for $\phi$. This is not a finite set of axioms, and you can show that there can be no finite set of axioms that works, but if we were to allow a symbol $\forall^* \phi$ which quantified over all formulas $\phi$ (which is basically what second-order logic is), we would be able to write this down as a single axiom.
There are indeed many presentations of classical propositional logic1.
If complexity was measured purely in terms of number of axioms, then yes, Mendelson's axiom system would be more "economical". We could be even more "economical" with Meredith's system: $$((((A\to B)\to(\neg C\to\neg D))\to C)\to E)\to((E\to A)\to(D\to A))$$ Just rolls right off the tongue. While minimality is often a driver, people do still need to discover more minimal systems. (Indeed, the above is not the most minimal system if we include the complexity of the axiom and not just the number.) There's also the question of accepting the axioms. Ideally, we want the axioms to be "self-evident" or at least easy to understand intuitively. Maybe it's just me, but Meredith's axiom does not leap out at me as something that should obviously be true, let alone sufficient to prove all other classical tautologies.
Minimality is, however, not the "whole point" of axiomatic systems. You mention another reason: sometimes you actually do want to prove things at which point it is better to have a richer and more intuitive axiomatic system. You may argue that we can just derive any theorems we want to use from some minimal basis, and then forget about that basis. This is true but unnecessary complexity if we have no other reason for considering this minimal basis. When we compare different styles of proof system (e.g. Hilbert v. Sequent Calculus v. Natural Deduction), translations between them (especially into Hilbert-style systems) can involve a lot of mechanical complexity. That complexity can sometimes be significantly reduced by a careful choice of axioms.
For the Laws of the Excluded Middle (LEM) and Non-contradiction, the first thing you'd need to do is define the connectives. You can't prove $\neg(P\land\neg P)$ in a system that doesn't have $\land$. Given $\neg$ and $\to$ as primitives, standard definitions of $\land$ and $\lor$ are $P\land Q:\equiv \neg(P\to\neg Q)$ and $P\lor Q:\equiv\neg P\to Q$. With these definitions (or others), then yes, the LEM and Non-contradiction can both be proven in the systems you mention and any other proof system for classical propositional logic. Your concern here is an illustration that we often care about which axioms we have and not just that they're short and effective.
This also leads to another reason why we might want a certain presentation. We may want that presentation to line up with other, related logics. As you're starting to realize, it is ill-defined to say something like "intuitionistic propositional logic (IPL) is classical propositional logic (CPL) minus the LEM". When people say things like this, they are being sloppy. However, "CPL is IPL plus LEM" is unambiguous. Any presentation of IPL to which we add LEM is a presentation of CPL. For that presentation, it makes sense to talk about removing LEM. It doesn't make sense to talk about removing an axiom without a presentation of axioms that contains that axiom. It is also quite possible to have a presentation of CPL containing LEM that becomes much weaker than IPL when LEM is removed. In fact, you'd expect this because a presentation of IPL with LEM added is likely to be redundant because things which are intuitionistically distinct become identifiable when LEM is added. The story is the same for paraconsistent logics.
While it isn't as much of a driver for Hilbert-style proof systems, for natural deduction and sequent calculi the concerns of structural proof theory often push for more axioms (or rules of inference, rather). For example, many axioms in a Hilbert-style proof system mix together connectives, e.g. as Meredith's does above. This means we can't understand a connective on its own terms, but only by how it interacts with other connectives. A driving force in typical structural presentations is to characterize connectives by rules that don't reference any other connectives. This better reveals the "true" nature of the connectives, and it makes the system more modular. It becomes meaningful to talk about adding or removing a single connective allowing you to build a logic à la carte. In fact, structural proof theory motivates many constraints on what a proof system should look like such as logical harmony.
1 And that link only considers Hilbert-style proof systems.
Best Answer
There are specific deduction systems that have only a finite number of logical axioms, or even no logical axioms. One example is natural deduction, which trades away logical axioms for a more complicated set of inference rules.
However, the use of infinite axiom schemes of logical axioms is not a problem for many of the applications of finite axiomatizability. If we have a structure $M$, or are constructing it, and we want to tell whether $M$ satisfies a theory $T$, there are two main issues we face:
For finitely axiomatizable theories, the second question is not an issue, so we can focus just on the first one. In any case, we don't need to worry whether our structure $M$ satisfies the logical axioms - we know it does, because they are true in every structure. We only need to worry about whether the structure satisfies the additional axioms of $T$. This is why finite axiomatizability is defined in terms of the non-logical axioms only.
If there are only finitely many new axioms in $T$, we can put all these axioms into a single sentence $\phi$ so that an arbitrary structure satisfies $T$ if and only if it satisfies $\phi$. That is a nice situation to be in, and it allows us in some cases to prove more about finitely axiomatizable theories than we would be able to prove about non-finitely-axiomatizable ones.
The existence of the single sentence $\phi$ also means there is a sort of "absoluteness" to whether a model satisfies $T$. If $T$ has an infinite axiom scheme, then from the point of view of a nonstandard model of arithmetic the scheme may well have nonstandard axioms - this is a subtle issue that can cause various kinds of confusion when people work with Gödel's incompleteness theorem. For a finitely axiomatized theory, because we can work with the single sentence $\phi$, even nonstandard models of arithmetic agree about the axioms for $T$.