A crucial difference between non-Euclidean geometries and "non-inductive" models of PA- is that any model of PA- contains a canonical copy of the true natural numbers, and in this copy of $N$, the induction schema is true. In other words, PA is part of the complete theory of a very canonical model of PA-, and as such it seems much more natural (so to speak) to study fragments of PA rather than extensions of PA- which contradict induction.
To make this a little more precise (and sketch a proof), the axioms of PA- say that any model $M$ has a unique member $0_M$ which is not the successor of anything, and that the successor function $S_M: M \to M$ is injective; so by letting $k_M$ (for any $k \in N$) be the $k$-th successor of $0_M$, the set $\{k_M : k \in N\}$ forms a submodel of $M$ which is isomorphic to the usual natural numbers, $N$. Any extra elements of $M$ not lying in this submodel lie in various "Z-chains," that is, infinite orbits of the model $M$'s successor function $S_M$.
(In the language of categories: the "usual natural numbers" are an initial object in the category of all models of PA-, where morphisms are injective homomorphisms in the sense of model theory.)
So, while PA seems natural, I'm not sure why there would be any more motivation to study PA- plus "non-induction" than there is to study any of the other countless consistent theories you could cook up, unless you find that one of these non-inductive extensions of PA- has a particularly nice class of models.
My example is the classical proof that sqrt(2) is irrational.
More generally, many proofs that proceed by showing that there are no minimal counterexamples exemplify your phenomenon. The method of no-minimal-counterexamples is exactly the same as strong induction, but where one proves the required implication by contradiction. In many applications of this method, it is often clear that the smallest numbers are not counterexamples, and this would not ordinarily regarded as a separate base "case".
In the classical proof that sqrt(2) is irrational, for example, we suppose sqrt(2) = p/q, where p is minimal. Now, square both sides and proceed with the usual argument, to arrive at a smaller counterexample. Contradiction! This amounts to a proof by strong induction that no rational number squares to 2, and there seems to be no separate base case here.
People often carry out the classical argument by assuming p/q is in lowest terms, but the argument I just described does not need this extra complication. Also, in any case, the proof that every rational number can be put into lowest terms is itself another instance of the phenomenon. Namely, if p/q is a counterexample with p minimal, then divide by any common factor and apply induction. There seems to be no separate base case here where it is already in lowest terms, since we were considering a minimal counterexample. Perhaps someone objects that there is no induction here at all, since one can just divide by the gcd(p,q). But the usual proof that any two numbers have a gcd is, of course, also inductive: considering the least linear combination xq+yp amounts to strong induction, again with no separate base case.
Best Answer
The terms "weak induction" and "strong induction" are not commonly used in the study of logic. The terms are commonly used only in books aimed at teaching students how to write proofs.
Here are their prototypical symbolic forms:
weak induction: $(\Phi(0) \land (\forall n) [ \Phi(n) \to \Phi(n+1)]) \to (\forall n) \Phi(n)$
strong induction: $(\Psi(0) \land (\forall n) [ (\forall m \leq n) \Psi(m) \to (\forall m \leq n+1) \Psi(m)]) \to (\forall n) \Psi(n)$
The thing to notice is that "strong" induction is almost exactly weak induction with $\Phi(n)$ taken to be $(\forall m \leq n)\Psi(n)$. In particular, strong induction is not actually stronger, it's just a special case of weak induction modulo some trivialities like replacing $\Psi(0)$ with $(\forall m \leq 0 )\Psi(m)$. Of course you can write variations of the symbolic forms, but the same point applies to all of them: "strong" induction is essentially just weak induction whose induction hypothesis has a bounded universal quantifier.
So the question is not why we still have "weak" induction - it's why we still have "strong" induction when this is not actually any stronger.
My opinion is that the reason this distinction remains is that it serves a pedagogical purpose. The first proofs by induction that we teach are usually things like $\forall n \left [ \sum_{i=0}^n i = n(n+1)/2 \right ]$. The proofs of these naturally suggest "weak" induction, which students learn as a pattern to mimic.
Later, we teach more difficult proofs where that pattern no longer works. To give a name to the difference, we call the new pattern "strong induction" so that we can distinguish between the methods when presenting a proof in lecture. Then we can tell a student "try using strong induction", which is more helpful than just "try using induction".
In terms of logical strength in formal arithmetic, as you can see above, the two forms are equivalent over some weak base theory as long as you are looking at induction for a class of formulas that is closed under bounded universal number quantification. In particular, all the syntactic classes of the analytical and arithmetical hierarchies have that property, so weak induction for $\Sigma^0_k$ formulas is the same as strong induction for $\Sigma^0_k$ formulas, weak induction for $\Pi^1_k$ formulas is the same as strong induction for $\Pi^1_k$ formulas, and so on. This equivalence will hold under any reasonable formalization of "strong" induction - I chose mine above to make the issue particularly obvious.
Addendum I was asked in a comment why $$ (1)\colon (\forall t)[(\forall m < t)\Phi(m) \to \Phi(t)] $$ implies $$ (2)\colon (\forall n)[(\forall m \leq n)\Phi(m) \to (\forall m \leq n+1) \Phi(m)]. $$ I'm going to give a relatively formal proof to show how it goes. The proof is not by induction, instead it just uses universal generalization to prove the universally quantified statements.
For the proof, I will assume (1) and prove (2). Working towards that goal, I fix a value of $n$ and assume: $$ (3)\colon (\forall m \leq n)\Phi(m). $$
I first want to prove $(\forall m < n+1)\Phi(m)$, which is an abbreviation for $(\forall m)[m < n+1 \to \Phi(m)]$. Pick an $m$. If $m < n+1$, then $m \leq n$, so I know $\Phi(m)$ by assumption (3). So, by universal generalization, I obtain $(\forall m < n+1)\Phi(m)$.
Next, note that a substitution instance of (1) gives $(\forall m < n+1)\Phi(m) \to \Phi(n+1)$. I have proved $(\forall m < n+1)\Phi(m)$ so I can assert $\Phi(n+1)$.
So now I have assumed $(\forall m \leq n)\Phi(m)$ and I have also proved $\Phi(n+1)$. Another proof by cases establishes $(\forall m \leq n+1)\Phi(m)$.
By examining the proof, you can see which axioms I need in my weak base theory. I need at least the following two axioms:
$(m \leq t) \leftrightarrow (m < t) \lor (m=t)$
$(m < t+1) \to (m \leq t)$
I believe those are the only two axioms I used in the proof.