Nature of incompleteness
A computational system is said to be Turing complete if the system can be used to simulate an arbitrary Turing machine on an arbitrary input string. Since $\lambda^{\rightarrow}$ is strongly normalizing, every well-typed program is guaranteed to reduce to an irreducible normal form (which is to say, every program halts). As such, we couldn't possibly write a $\lambda^{\rightarrow}$ program to simulate a Turing machine on an input for which the machine never halts; all $\lambda^{\rightarrow}$ programs halt and such a Turing machine on such an input, trivially, does not.
Less formally, simulating arbitrary Turing machines on arbitrary inputs means you need some form of conditional choice (so as to emulate the tape head's ability to move along the tape), and some means of storing and manipulating an arbitrary number of memory locations (so as to emulate the tape). For $\lambda^{\rightarrow}$, typing rules are sufficient to encapsulate the notion of choice, but we can't operate over arbitrary amounts of memory because types can't recurse.
Counterexamples to typability
As for things that cannot be expressed in $\lambda^{\rightarrow}$, a classic example is fixed-point operators.
Consider for instance the Y combinator, defined as $\lambda f.(\lambda x.\, f(xx))(\lambda x.\, f(xx))$.
Notice that within $(xx)$, we cannot assign a single type to $x$ that fits both occurrences: the first occurrence would need a function type whose own domain is the type of $x$, which no simple type can provide.
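This failure can be made concrete as the occurs check in type unification. Below is a minimal sketch in Python (the encoding of types as strings for variables and pairs for arrows is my own assumption, not a standard library):

```python
# Types: a string is a type variable; a pair (dom, cod) is the arrow type dom -> cod.

def occurs(var, ty):
    """Check whether type variable `var` appears inside type `ty`."""
    if isinstance(ty, str):
        return ty == var
    dom, cod = ty
    return occurs(var, dom) or occurs(var, cod)

# Typing (x x): both occurrences of x share one type variable 'a'.
# The application forces 'a' to unify with the arrow type 'a' -> 'b',
# i.e. ('a', 'b') -- but 'a' occurs inside ('a', 'b'), so unification
# fails the occurs check, and \x. x x has no simple type.
print(occurs('a', ('a', 'b')))  # True: the occurs check fires
```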
An answer to a similar question on the CSTheory site suggests a more clever example: the function which takes a computable function argument and returns its Gödel number.
In this case again, self-reference plays a central role and using a diagonal argument one can prove that a simply-typed term for such a function cannot exist.
You are right in your intuition that proving that a derivation does not exist is somewhat cumbersome. Soundness and completeness of formal proof systems provide us with an interesting duality:
$\vDash \phi \Leftrightarrow$ all models satisfy $\phi$ $\Leftrightarrow$ there exists no model that does not satisfy $\phi$

$\quad\Updownarrow$ ($\Uparrow$ soundness, $\Downarrow$ completeness)

$\vdash \phi \Leftrightarrow$ there exists a derivation that proves $\phi$

and dually:

$\not \vDash \phi \Leftrightarrow$ not all models satisfy $\phi$ $\Leftrightarrow$ there exists a model that does not satisfy $\phi$

$\quad\Updownarrow$ ($\Uparrow$ completeness, $\Downarrow$ soundness)

$\not \vdash \phi \Leftrightarrow$ there exists no derivation that proves $\phi$ $\Leftrightarrow$ all potential derivations fail to prove $\phi$
Proving a universally quantified statement ("all") is often awkward: you must argue that every entity from a possibly infinite domain (every structure, or every possible derivation) has the desired property (such as (not) satisfying or (not) proving a formula). Proving an existentially quantified statement ("there is"), by contrast, is easy: it suffices to provide one example (one counter-model, or one proof) and you are done.
So if you are supposed to show $\vDash$, rather than making a possibly cumbersome semantic argument about validity in all models, soundness and completeness allow us to show $\vdash$ instead, i.e. to provide one syntactic derivation of the desired formula.
On the other hand, showing $\not \vdash$, i.e. arguing that there can be no derivation, can be annoying; but soundness and completeness let us instead show $\not \vDash$ by providing one counter-model, and we are done.
So the easier way to show $\not \vdash t: ((\sigma \to \tau) \to \sigma) \to \sigma$ would be a semantic argument: exhibit a model of typed $\lambda$-calculus/intuitionistic logic in which the formula $((\sigma \to \tau) \to \sigma) \to \sigma$ does not hold. A suitable semantics for intuitionistic logic is, e.g., Kripke semantics.
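For instance, a two-world Kripke model already refutes the formula (a standard construction; the world names are of course arbitrary). Take worlds $w_0 \le w_1$ with $w_1 \Vdash \sigma$, and with $\tau$ forced nowhere. Then $w_0 \nVdash \sigma \to \tau$ and $w_1 \nVdash \sigma \to \tau$ (since $w_1 \Vdash \sigma$ but $w_1 \nVdash \tau$), so $w_0 \Vdash (\sigma \to \tau) \to \sigma$ holds vacuously; but $w_0 \nVdash \sigma$, hence $w_0 \nVdash ((\sigma \to \tau) \to \sigma) \to \sigma$.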
If you are, however, interested in a syntactic proof, you need to argue that no derivation is possible, i.e. that any attempt to prove the formula eventually reaches a dead end where the proof cannot be continued, and you need to reason why.
So let's attempt to build up a derivation tree:
The only reasonable way to start (bottom-up) is to first (un-)do an implication introduction to get rid of $((\sigma \to \tau) \to \sigma)$: implication elimination would require an even larger major premise and a suitable minor premise$^1$, and introducing a new variable is not an option because the set of assumptions (the left-hand side of the sequent) has to be empty in the conclusion.
The next reasonable step after $(\to I)$ is to make use of $x: ((\sigma \to \tau) \to \sigma)$ - which can in turn be derived by one application of the identity rule - to show $x(\lambda y.M): \sigma$ by an application of $(\to E)$, which requires a proof of the minor premise, $\lambda y.M: (\sigma \to \tau)$.
With a similar argument as in the first step, the only useful continuation for the latter node is another $(\to I)$, which adds $y: \sigma$ as a new assumption and leaves us to prove $M: \tau$. This is the point where we get stuck.
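In sequent notation, the partial tree built so far looks like this (writing $\Gamma$ for $x : (\sigma \to \tau) \to \sigma$, with the open goal marked $?$):

$$\dfrac{\dfrac{\dfrac{}{\Gamma \vdash x : (\sigma \to \tau) \to \sigma}\,(Id) \qquad \dfrac{\Gamma,\ y : \sigma \vdash M : \tau \;\; (?)}{\Gamma \vdash \lambda y.M : \sigma \to \tau}\,(\to I)}{\Gamma \vdash x(\lambda y.M) : \sigma}\,(\to E)}{\vdash \lambda x.(x(\lambda y.M)) : ((\sigma \to \tau) \to \sigma) \to \sigma}\,(\to I)$$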
The first observation is that we can't use $(\to I)$ anymore because $\tau$ is not a function type.
Simply adding $\tau$ to the context, i.e. introducing a new variable $z$ which has the type $\tau$, and closing the branch with $(Id)$ is not an option either, because we would need to carry this additional assumption down the tree, and end up with the sequent $z: \tau \vdash \lambda x.(x(\lambda y.z)): ((\sigma \to \tau) \to \sigma) \to \sigma$, with a context containing $z: \tau$ instead of being empty. So we would not have a proof (= a derivation of the form $\vdash t: \alpha$, with no assumptions on the left-hand side of the sequent), but instead the derivability of the type would depend on the stipulation of some variable $z$ being declared to have the type $\tau$. But then our term would contain a free variable, which does not solve the type inhabitation problem: That for any type $\alpha$, we could find a trivial "inhabitation" of the form $x:\alpha \vdash x:\alpha$ is uninteresting - instead, type inhabitation asks about the existence of a closed term.
So alternatively, we could attempt to introduce new declarations of the form $\ldots \to \tau$ into the context and continue the tree with applications of elimination rules; however, this merely shifts the problem up the tree:
If we choose $\ldots \vdash \sigma \to \tau$ as the major premise, with $\ldots \vdash \sigma$ (derived by an application of the $(Id)$ rule) as the minor premise to the $(\to E)$ rule, an endless loop occurs, since a similar rule application already appears one step lower in the tree. The fact that the context now additionally contains $\sigma$ makes no difference in this case.
If we choose $\vdash \tau \to \tau$ as the major premise to the $(\to E)$ rule, another endless loop occurs, since the other branch (the minor premise) would again need to contain $\vdash \tau$ with the same context.
And introducing new type variables different from $\sigma$ and $\tau$ wouldn't help either, but only leave us with even more to prove.
So there is no way to continue the upper right branch of the derivation tree, and at no point could we have made a smarter choice by selecting a different rule.
Since the tree cannot be completed, the type is not inhabited.
As you see, proving that a formula is not derivable can be rather cumbersome, because you need to make sure that every way of trying to build up a tree will eventually fail - I hope my proof was comprehensive enough in this respect.
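Incidentally, inhabitation in the $\{\to\}$-fragment is decidable, so the dead-end argument above can also be mechanized. Here is a small sketch in Python of a Ben-Yelles-style search for inhabitants in long normal form (the encoding of types as strings and pairs is my own assumption):

```python
# Types: a string is an atomic type; a pair (dom, cod) is the arrow dom -> cod.

def inhabited(goal, ctx=frozenset(), seen=frozenset()):
    """Decide whether `goal` is inhabited by a closed simply-typed term,
    given the hypotheses in `ctx`, by searching for a long normal form:
    peel off all arrows, then apply some hypothesis whose target matches."""
    args = []
    while isinstance(goal, tuple):          # goal = B1 -> ... -> Bn -> p
        dom, goal = goal
        args.append(dom)
    ctx = ctx | frozenset(args)             # (->I): move premises into the context
    if (ctx, goal) in seen:                 # loop check: same subgoal on this path
        return False
    seen = seen | {(ctx, goal)}
    for hyp in ctx:                         # try (->E) with each hypothesis
        premises, target = [], hyp
        while isinstance(target, tuple):    # hyp = C1 -> ... -> Ck -> q
            dom, target = target
            premises.append(dom)
        if target == goal and all(inhabited(c, ctx, seen) for c in premises):
            return True
    return False

s, t = 's', 't'
peirce = (((s, t), s), s)                   # ((s -> t) -> s) -> s
print(inhabited(peirce))                    # False: Peirce's law is not inhabited
print(inhabited(((peirce, t), t)))          # True: its double negation is
```

The loop check is sound because contexts only grow along a search path and goals are subformulas of the original type, so revisiting the same (context, goal) pair means no progress was made.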
Another thing you might find interesting w.r.t. the correspondence to the $\{\to\}$-fragment of intuitionistic logic: the formula $(((((\sigma \to \tau) \to \sigma) \to \sigma) \to \tau) \to \tau)$, i.e. the above formula with $\to \tau$ appended twice, is derivable/inhabited. This formula represents the double negation of Peirce's law (under the assumption that $\tau$ stands for $\bot$, and $\neg \neg \phi$ is seen as an abbreviation of $(\phi \to \bot) \to \bot$). It can be derived in intuitionistic logic, and in particular in the $\{\to\}$-fragment, i.e. in positive implication logic.$^2$ Thus, the type $(((((\sigma \to \tau) \to \sigma) \to \sigma) \to \tau) \to \tau)$ is inhabited (take $\alpha$ for $\sigma$ and $\beta$ for $\tau$):
An inhabitant (a $\lambda$-term) of this type (formula) is $\lambda x.\,x(\lambda y.\,y(\lambda z.\,x(\lambda w.\,z)))$. You can see that the subterm $\lambda y.\,y(\lambda z.\,M)$ has the same shape as the earlier attempt $\lambda x.(x(\lambda y.M))$ (up to renaming of variables), except that now $M$ can be resolved, namely as $x(\lambda w.\,z)$.
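To double-check this, here is a minimal bidirectional type checker in Python (the term and type encodings are my own assumptions) confirming that $\lambda x.\,x(\lambda y.\,y(\lambda z.\,x(\lambda w.\,z)))$ inhabits the doubly-negated type:

```python
# Terms: ('var', name), ('lam', name, body), ('app', fun, arg).
# Types: a string is atomic; a pair (dom, cod) is the arrow dom -> cod.

def synth(term, env):
    """Infer the type of a variable or an application (heads of normal forms)."""
    if term[0] == 'var':
        return env[term[1]]
    fun_ty = synth(term[1], env)            # term = ('app', fun, arg)
    dom, cod = fun_ty                       # the function must have an arrow type
    if not check(term[2], dom, env):
        raise TypeError('argument has the wrong type')
    return cod

def check(term, ty, env):
    """Check a term against an expected type."""
    if term[0] == 'lam':
        return isinstance(ty, tuple) and check(term[2], ty[1], {**env, term[1]: ty[0]})
    return synth(term, env) == ty

s, t = 's', 't'
peirce = (((s, t), s), s)                   # ((s -> t) -> s) -> s
dn_peirce = ((peirce, t), t)                # (peirce -> t) -> t
# \x. x (\y. y (\z. x (\w. z)))
term = ('lam', 'x', ('app', ('var', 'x'),
        ('lam', 'y', ('app', ('var', 'y'),
         ('lam', 'z', ('app', ('var', 'x'),
          ('lam', 'w', ('var', 'z'))))))))
print(check(term, dn_peirce, {}))           # True
```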
Some more words on Peirce from a proof-theoretic perspective:
A derivation of Peirce's law in classical logic requires an application of the reductio ad absurdum rule (or something equivalent, like double negation elimination): an assumption $\neg \alpha$ is made from which $\bot$ is derived, then $\alpha$ is concluded while discharging the assumption $\neg \alpha$. Again, you would need to argue why RAA is necessary, in other words, why there can be no derivation without this step.
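Spelled out (writing $\neg \alpha$ for $\alpha \to \bot$), the classical derivation runs:

1. $[(\alpha \to \bot) \to \alpha]^1$ (assumption)
2. $[\alpha \to \bot]^2$ (assumption)
3. $\alpha$ (from 1, 2 by $(\to E)$)
4. $\bot$ (from 2, 3 by $(\to E)$)
5. $\alpha$ (RAA, discharging assumption 2)
6. $((\alpha \to \bot) \to \alpha) \to \alpha$ ($(\to I)$, discharging assumption 1)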
RAA/DNE is part of (or admissible in) natural deduction for classical logic, but not for intuitionistic logic: in IL, only ex falso quodlibet sequitur is available, which allows one to conclude anything from $\bot$ but does not allow discharging assumptions of the form $\neg \alpha$. In fact, from a proof-theoretic perspective, the presence or absence of RAA/DNE is exactly what tells the two systems apart; the remaining set of rules is the same. So IL is basically CL with RAA/DNE replaced by the weaker EFQ rule.
Since RAA/DNE is necessary for the proof but this rule is not available in IL, Peirce's law is not derivable in ND (or any other system) for IL, and in turn not in positive implication logic, the $\{\to\}$-fragment of IL.
$^1$ In a formal proof, the major premise is the premise containing the occurrence of the operator to be eliminated by the rule; the minor premises are the other premises. In the $(\to E)$ rule, with a conclusion of the form $\Gamma \vdash \beta$ and two premises, the major premise is the one of the form $\Gamma \vdash \alpha \to \beta$, and the minor premise is the one of the form $\Gamma \vdash \alpha$.
$^2$ In fact, it can be shown (Glivenko's theorem) that for any propositional formula which is valid classically but not in IL, its double negation can be derived in IL. Unlike in classical logic, in IL $\neg \neg \phi$ is not equivalent to $\phi$: it is a weaker proposition, so $\phi$ implies $\neg \neg \phi$, but not necessarily the other way round.