I'm able to prove it "independently" from the Deduction Theorem, but the proof is quite longer ...
The axioms are :
$F \rightarrow (G \rightarrow F)$
$(F \rightarrow (G \rightarrow H))\rightarrow ((F \rightarrow G) \rightarrow (F \rightarrow H))$
$(\neg G \rightarrow \neg F) \rightarrow ((\neg G \rightarrow F) \rightarrow G)$
For readibility, I'll organize the proof with some preliminary results :
T1 : $P \rightarrow P$
1) $P \rightarrow ((Q \rightarrow P) \rightarrow P)$ --- Ax.1
2) $P \rightarrow (Q \rightarrow P)$ --- Ax.1
3) $(1) \rightarrow ((2) \rightarrow (P \rightarrow P))$ --- Ax.2
4) $P \rightarrow P$ --- from 3), 1) and 2) by Modus Ponens twice.
T2 : $(Q \rightarrow R) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R))$
1) $(P \rightarrow (Q \rightarrow R)) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R))$ --- Ax.2
2) $(1) \rightarrow ((Q \rightarrow R) \rightarrow (1))$ --- Ax.1
3) $(Q \rightarrow R) \rightarrow (1)$ --- from 1) and 2) by Modus Ponens
4) $(Q \rightarrow R) \rightarrow (P \rightarrow (Q \rightarrow R))$ --- Ax.1
5) $(3) \rightarrow ((4) \rightarrow ((Q \rightarrow R) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R))))$ --- Ax.2
6) $(Q \rightarrow R) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R))$ --- from 5), 3) and 4) by MP twice.
T3 : $P \rightarrow ((P \rightarrow Q) \rightarrow Q)$
1) $(P \rightarrow Q) \rightarrow (P \rightarrow Q)$ --- T1
2) $(1) \rightarrow (((P \rightarrow Q) \rightarrow P) \rightarrow ((P \rightarrow Q) \rightarrow Q))$ --- Ax.2
3) $((P \rightarrow Q) \rightarrow P) \rightarrow ((P \rightarrow Q) \rightarrow Q)$ --- from 1) and 2) by MP
4) $P \rightarrow ((P \rightarrow Q) \rightarrow P)$ --- Ax.1
5) $(3) \rightarrow ((4) \rightarrow (P \rightarrow ((P \rightarrow Q) \rightarrow Q)))$ --- T2
6) $P \rightarrow ((P \rightarrow Q) \rightarrow Q)$ --- from 5), 3) and 4) by MP twice.
T4 : $(P \rightarrow (Q \rightarrow R)) \rightarrow (Q \rightarrow (P \rightarrow R))$
1) $((P \rightarrow Q) \rightarrow (P \rightarrow R)) \rightarrow ((Q \rightarrow (P \rightarrow Q)) \rightarrow (Q \rightarrow (P \rightarrow R)))$ --- T2
2) $(P \rightarrow (Q \rightarrow R)) \rightarrow ((P \rightarrow Q) \rightarrow (P \rightarrow R))$ --- Ax.2
3) $Q \rightarrow (P \rightarrow Q)$ --- Ax.1
4) $(1) \rightarrow ((2) \rightarrow ((P \rightarrow (Q \rightarrow R)) \rightarrow ((3) \rightarrow (Q \rightarrow (P \rightarrow R)))))$ --- T2
5) $(P \rightarrow (Q \rightarrow R)) \rightarrow ((3) \rightarrow (Q \rightarrow (P \rightarrow R)))$ --- from 4), 1) and 2) by MP twice
6) $(3) \rightarrow ((3) \rightarrow (Q \rightarrow (P \rightarrow R))) \rightarrow (Q \rightarrow (P \rightarrow R))$ --- T3
7) $((3) \rightarrow (Q \rightarrow (P \rightarrow R))) \rightarrow (Q \rightarrow (P \rightarrow R))$ --- from 6) and 3) by MP
8) $(7) \rightarrow ((5) \rightarrow ((P \rightarrow (Q \rightarrow R)) \rightarrow (Q \rightarrow (P \rightarrow R))))$ --- T2
9) $(P \rightarrow (Q \rightarrow R)) \rightarrow (Q \rightarrow (P \rightarrow R))$ --- from 8), 7) and 5) by MP twice.
Now for the proof :
1) $(\neg C \rightarrow \neg B) \rightarrow ((\neg C \rightarrow B) \rightarrow C)$ --- Ax.3
2) $B \rightarrow (\neg C \rightarrow B)$ --- Ax.1
3) $(\neg C \rightarrow B) \rightarrow ((\neg C \rightarrow \neg B) \rightarrow C)$ --- from 1) and T4 by Modus Ponens
4) $B \rightarrow ((\neg C \rightarrow \neg B) \rightarrow C)$ --- from T2, 3) and 2) by MP twice
5) $(\neg C \rightarrow \neg B) \rightarrow (B \rightarrow C)$ --- from T4 and 4) by MP
The statement is known as Peirce's Law, and the proof is pretty nasty. I can believe someone can spend $10$ years on it without cracking it!
The proof uses some helpful Lemma's.
First, let's prove: $\phi \to \psi, \psi \to \chi, \phi \vdash \chi$:
$\phi \to \psi$ Premise
$\psi \to \chi$ Premise
$\phi$ Premise
$\psi$ MP 1,3
$\chi$ MP 2,4
By the Deduction Theorem, this gives us Hypothetical Syllogism (HS): $\phi \to \psi, \psi \to \chi \vdash \phi \to \chi$
Now let's prove the general principle that $\neg \phi \vdash (\phi \to \psi)$:
$\neg \phi$ Premise
$\neg \phi \to (\neg \psi \to \neg \phi)$ Axiom1
$\neg \psi \to \neg \phi$ MP 1,2
$(\neg \psi \to \neg \phi) \to (\phi \to \psi)$ Axiom2
$\phi \to \psi$ MP 3,4
With the Deduction Theorem, this means $\vdash \neg \phi \to (\phi \to \psi)$ (Duns Scotus Law)
Let's use Duns Scotus to show that $\neg \phi \to \phi \vdash \phi$ (Law of Clavius):
$\neg \phi \to \phi$ Premise
$\neg \phi \to (\phi \to \neg (\neg \phi \to \phi))$ (Duns Scotus Law)
$(\neg \phi \to (\phi \to \neg (\neg \phi \to \phi))) \to ((\neg \phi \to \phi) \to (\neg \phi \to \neg (\neg \phi \to \phi)))$ Axiom3
$(\neg \phi \to \phi) \to (\neg \phi \to \neg (\neg \phi \to \phi))$ MP 2,3
$\neg \phi \to \neg (\neg \phi \to \phi)$ MP 1,4
$(\neg \phi \to \neg (\neg \phi \to \phi)) \to ((\neg \phi \to \phi) \to \phi)$ Axiom2
$(\neg \phi \to \phi) \to \phi$ MP 5,6
$\phi$ MP 1,7
Using Duns Scotus and the Law of Clavius, we can now show that $ (\phi \to \psi) \to \phi \vdash \phi$:
$(\phi \to \psi) \to \phi$ Premise
$\neg \phi \to (\phi \to \psi)$ Duns Scotus
$\neg \phi \to \phi$ HS 1,2
$\phi$ Law of Clavius 3
And so finally, by the Deduction Theorem, we have: $\vdash ((\phi \to \psi) \to \phi) \to \phi$
Best Answer
Your attempt of derivation is wrong. Indeed, in axiom $A5$, there is a side condition: the formula $B$ must not contain any free occurrences of the variable $x$ (this side condition should be reported in Mendelson's textbook), while in the formula you want to prove $B$ might contain free occurrences of $x$. Therefore, the third step of your derivation is wrong: you cannot justify the formula $∀x(B \to C) \to (B \to ∀x C)$ as an instance of axiom $A5$, because $x$ might occur free in $B$.
On the contrary, Mendelson's derivation is, of course, perfectly correct. In particular, it does not use any axiom with such restrictions.
By the way, note also that the use of the generalization rule is affected by a side condition: you can derive $\forall x B$ from $B$ provided that the variable $x$ does not occur free in any hypotheses of the derivation. Mendelson's derivation respects this side condition when it uses the generalization rule.
Side conditions are important, because they avoid to derive formulas that should not be provable.