See the article of Andrés Raggio, DIRECT CONSISTENCY PROOF OF GENTZEN'S SYSTEM OF
NATURAL DEDUCTION (Notre Dame Journal of Formal Logic, 1964).
The proof "by inspection" does not work in Natural Deduction : we have for example the $\lor$-elimination rule that allows us to "introduce" a new formula $C$ that is not a subformula of the major premisse $A \lor B$.
The same for $\lor$-introduction, where we pass from $A$ to $A \lor B$.
We need induction on the derivation-tree.
Added March, 5
Now I've got it !
See the detailed study of Gentzen's unpublished manuscript thesis of 1932 in Jan von Plato, Gentzen's Proof Systems: byproducts in a work of genius (The Bull of Sumb Log, 2012), page 329.
In Gentzen's thesis there is a conjecture about the normalization theorem for derivation in intuitionistic natural deduction, then transformed into a proof.
In Gentzen's original presentation of natural deduction, finished formal derivations ended with the closing of all open assumptions. [...] With the closing of all open assumptions in the end, the conclusion became the longest formula in a derivation that also contained all the formulas in the derivation as subformulas, if no detours had been taken [emphasis mine].
"A closer inspection of the natural deduction calculus led me in the end to a general theorem that I want to call the 'Hauptsatz'. [...] The natural calculus proved not to be suitable."
Then, Gentzen "moved to" sequent calculus, but von Plato in his paper shows that he was able to find a proof (unpublished) also for intuitionistic natural deduction.
In conclusion, the conjecture that, if we do not have $\bot$ in the axioms (assumptions) the system is consistent, is not in general true; in order to assert it, we must be able to "avoid detours". The result must be attained in natural deduction through the normalization theorem.
Note. In natural deduction the presence of $\supset$-E (modus ponens) plays the role of cut : from $A$ and $A \supset B$ we derive $B$; in this way, we lose the "trace" of $A$.
In a system based on sequent calculus, instead, the consistency proof "by inspection" works because the "no detours" condition is "built in" into the rules: no assumption is lost during the derivation.
So, if $\bot$ is not present into the axioms, we may conclude that it will not be present in the conclusion.
The implication
$$
[((P\rightarrow Q)\rightarrow P)\rightarrow P ] \rightarrow P\lor\neg P \quad\quad\quad\quad (1)
$$
is in fact not a theorem of intuitionistic propositional logic. You can check this semantically, with a Kripke frame with two worlds $w_1\le w_2$, with $P, Q$ both satisfied in $w_2$ and none of them satisfied in $w_1$. See here for background.
Or you could try to prove (1) in a sequent calculus and discover that you get stuck.
What is true is that if you make Peirce's law an additional axiom, then $P\lor\neg P$ becomes a theorem. In symbols:
$$
\{((A\rightarrow B)\rightarrow A )\rightarrow A\} \vdash P\lor\neg P
$$
Here the curly brackets are supposed to refer to the collection of all substitution instances.
This is discussed in the answer you linked. (So Peirce's law and LEM are equivalent in the sense that adding either of them will give you classical logic.)
By the deduction theorem, you then also obtain an implication similar to (1) as a theorem, but if you check what (substitution) instance of Peirce's was actually used, you'll find that $A=P\lor\neg P$, $B=\bot$. So
$$
\vdash [((P\lor\neg P\rightarrow \bot)\rightarrow P\lor\neg P )\rightarrow P\lor\neg P] \rightarrow P\lor\neg P ,
$$
which of course is not the same as (1).
As for your last question, taken in an abstract sense, it is definitely possible to have "$\vdash A$ implies $\vdash B$" satisfied, but $A\rightarrow B$ is not a formal theorem: the hypothesis holds whenever $\not\vdash A$, and there is of course no reason why this would make $A\rightarrow B$ a theorem.
Best Answer
Per the BHK interpetation, a proof of $(P \lor (P \to \bot)) \to \bot$ is pair of procedures, one for each disjunct in the hypothesis.
The first, $s_1$, is a procedure which takes a proof of $P$ as input and produces a proof of $\bot$. In other words, the first is a proof of $P \to \bot$.
The second, $s_2$, takes any proof of $P \to \bot$ as input and produces a proof of $\bot$. In other words, it is a proof of $(P \to \bot) \to \bot$.
By composing these to form $s_2(s_1)$, we obtain a procedure which takes no input and produces a proof of $\bot$.
The procedure just described (composition) takes any proof of $(P \lor (P \to \bot)) \to \bot$ and produces a proof of $\bot$. This means that $((P \lor (P \to \bot))\to \bot) \to \bot$ is provable.
This description shows how to write a $\lambda$ expression of the appropriate type. All that it does is to break the input into its two components and run the second one with the first as input. In particular, if we view $(P \lor \lnot P) \to \bot$ as a type $p(n,s)$ such that $\lambda s.p(1,s)$ is of type $P \to \bot$ and $\lambda\,s.p(2,s)$ is of type $\lnot P \to \bot$ then $$\lambda\, p(n,s) . p(2,\lambda\, s.p(1,s))$$ is of type $((P \lor \lnot P) \to \bot) \to \bot$