Let KP0 be the induction-free fragment of KP. That is, the axioms of KP0 are extensionality, pairing, union, Δ0-separation, Δ0-collection, and (since I see no way of proving this from the other axioms) the existence of transitive closures.
Following Barwise, I define the class of Σ-formulas to be the smallest class which contains atomic formulas and their negations, and which is closed under conjunction, disjunction, bounded quantification, and unbounded existential quantification. Similarly, I define the of Π-formulas to be the smallest class which contains atomic formulas and their negations, and which is closed under conjunction, disjunction, bounded quantification, and unbounded universal quantification. The negation of a Σ-formula φ is canonically equivalent to a Π-formula and vice versa; I will identify ¬φ with its canonical equivalent without further comment.
The following three facts are standard in KP. The usual proofs actually go through in KP0.
Σ-Reflection. For every Σ-formula KP0 ⊦ φ ↔ ∃aφ(a), where φ(a) is obtained from φ by replacing every unbounded quantifier ∃x by the corresponding bounded quantifier ∃x∈a.
Σ-Collection. For every Σ-formula φ(u,v),
KP0 ⊦ ∀u∈x∃vφ(u,v) →∃y∀u∈x∃v∈yφ(u,v).
Δ-Separation. For every Σ-formula φ(u) and every Π-formula ψ(u),
KP0 ⊦ ∀u∈x(φ(u) ↔ ψ(u)) → ∃y∀u∈x(u ∈ y ↔ φ(u)),
where y does not occur free in φ nor ψ.
Instead of Mike Shulman's foundation axiom, I will use the classically equivalent regularity axiom:
(Reg) x ≠ ∅ → ∃y∈x(x∩y = ∅).
This axiom already implies a certain amount of induction because of Δ-separation.
Δ-Induction. For every Σ-formula φ(u) and every Π-formula ψ(u),
KP0 + Reg ⊦ ∀u(φ(u) ↔ ψ(u)) → (∀u(∀v∈uφ(v) → φ(u)) → ∀uφ(u)).
For convenience, I will now assume that the language contains a function symbol rk(x) for the ordinal rank of a set. Let A be a model of KP0 + Reg. A cut in A is a proper initial segment I of the ordinals of A which has no least upper bound; a Σ-cut (resp. Π-cut) is one which can be defined by a Σ-formula (resp. Π-formula).
Σ-Cut Lemma. If A is a model of KP0 + Reg and I is a Σ-cut in A, then AI = {x ∈ A : rk(x) ∈ I} is a model of KP0 + Reg.
Sketch of proof. The main point of contention is Δ0-collection. Suppose that φ(u,v) is a Δ0-formula such that AI ⊧ ∀u∈x∃vφ(u,v). Consider the set of ordinals
J = {α ∈ A : ∃u∈x∀v(rk(v) < α → ¬φ(u,v))}.
This is a Π-definable initial segment of I. We cannot have I = J,
since that would violate Δ-induction. Pick β ∈ I \ J. Working in A, we can find a set y such that
∀u∈x∃v∈y(rk(v) < β ∧ φ(u,v)).
If necessary, we can cut down y so that rk(y) ≤ β and hence y ∈ AI. ∎
If A has a minimal Σ-cut, then this gives a model of KP0 that satisfies Σ-induction. Unfortunately, there are models of KP0 + Reg without minimal Σ-cut.
Now, on the positive side a minor twist of the proof of the Σ-cut lemma gives the following.
Π-Cut Lemma. If A is a model of KP0 that satisfies Π-induction and I is any cut of A, then AI is a model of KP0.
(Π-induction implies that the Π set J defined as above must have a least upper bound β ∈ I. The rest of the proof is identical.)
So when A satisfies Π-induction, we can take a sufficiently small cut of A to get a model of KP (with full induction). In fact, we can take the maximal wellfounded initial segment of the ordinals of A, with the caveat that this cut is very often just ω.
When I have a chance, I'll try to add examples showing that you can't do better than the Σ-cut lemma and the Π-cut lemma.
Addendum. I just stumbled across two papers that address these questions. The first paper is by Domenico Zambella Foundation versus induction in Kripke-Platek set theory [J. Symb. Logic 63 (1998), MR1665739] where it is shown that foundation implies open-induction, but not Σ1-induction over KP0 (without transitive closures). This is complemented by Antonella Mancini and Domenico Zambella A Note on Recursive Models of Set Theories [Notre Dame J. Formal Logic 42 (2001), MR1993394] where they generalize Tennenbaum's Theorem by showing that fragments of KP which prove Σ1-induction have no computable models other than the standard one.
Take the Zermelo ordinals to be defined by $Z(0) = 0$, $Z(\alpha+1) = \{Z(\alpha)\}$, and $Z(\lambda) = \{Z(\alpha): \alpha<\lambda\}$ (where $\alpha, \lambda$ are von Neumann ordinals). Then if we add $Z(\omega+ \omega)$ to $V_{\omega +\omega}$ and close under pairing, union, subsets, and powersets, we get a model of $T$ - Choice + $A$ which is not a model of $A^*$.
More precisely, let $D_0 = V_{\omega+\omega} \cup\{Z(\omega+\omega)\}$ and $D_{n+1}$ be the result of adding pairs, unions, subsets, and powersets of element of $D_n$ to $D_n$. Clearly, $M = \bigcup_{n <\omega}D_n$ is transitive and models $T$ - Choice. A simple induction shows that:
For every $x\in D_n$ there is an $\alpha<\omega+\omega$ such that the (von Neumann) ordinals in $tc(x)$ are less than $\alpha$.
Since $V_{\omega+\omega}\subseteq M$, it follows that the von Neumann ordinals in $M$ are just those in $\omega+\omega$. So it models $A$. Since the rank of $Z(\omega+\omega)$ is $\omega+\omega$, it doesn't model $A^*$.
(To get Choice in the form ``every set is well-orderable" we just throw in $x \times x$ at $D_{n+1}$ for $x\in D_n$).
Best Answer
I asked the same question about the replacement axiom not long ago at the $n$-Category Café, and the answer I got back from Mike Shulman is that it's used for example in the transfinite construction of free algebras, which really refers to a body of connected results in category theory as described here. The essential use made of replacement is in the transfinite compositions; this also occurs in the small object argument.
Having said that, a part of me still wonders whether there aren't workarounds. In many cases an initial algebra of a functor is situated inside a terminal coalgebra of the same functor, and the construction of the latter often doesn't require transfinite compositions (this is the case, e.g., for polynomial endofunctors). Paul Taylor in his book Practical Foundations of Mathematics has a section on general recursion using a theory of well-founded coalgebras, which is manifestly meaningful in contexts where one does without replacement, such as ETCS, and I wonder to what extent this could be put to use to construct free algebras without resorting to replacement.