I can answer this question positively modulo an assumption whose consistency strength I do not know. Roughly speaking, I need a $\beta$-model of $\mathsf{KP}(\mathcal{P})+$"there exists a Reinhardt cardinal". All I know is that the existence of a rank-into-rank cardinal (or an actual Reinhardt, since this construction can be done on the level of definable classes) is good enough. It's likely that somebody who knows more would be able to sharpen this result.
What I need precisely is that there exists a transitive set $M$ such that
- $M$ satisfies adjunction (i.e., for any sets $x,y \in M$, the set $x \cup \{y\}$ is in $M$),
- $M$ satisfies $\Delta_0$-comprehension,
- $M$ is a $\beta$-model (i.e., any internally well-founded set relation is externally well-founded),
- the ordinal rank functions on well-founded relations and on sets are definable in $M$,
- for every ordinal $\alpha \in M$, the set $V_\alpha^M$ of all sets in $M$ of rank less than $\alpha$ is an element of $M$, and
- there is a proper elementary self-embedding $f : M \prec M$ with critical point $\alpha$ (i.e., $f$ moves some rank up, and $\alpha$ is the smallest ordinal in $M$ such that $f(\alpha) > \alpha$).
Note that for any set $x \in M$, the Cartesian product $x \times x$ is an element of $M$. This is the standard argument that powerset and comprehension are enough to get the existence of Cartesian products. We need adjunction to make sure that Kuratowski ordered pairs exist and to ensure that the height of $M$ is a limit ordinal.
Given $f$, construct a sequence of structures $(N_i)_{i<\omega}$ by setting $N_0 = M$ and choosing each $N_{i+1}$ so that $(N_{i+1},N_i)\cong (M,f(M))$. Let $N = \bigcup_{i<\omega}N_i$. Clearly we have that $N \equiv M$.
$f$ induces an automorphism $j$ on $N$ defined by sending each element $a$ of $N_i\cong M$ to the element corresponding to $f^{-1}(a)$ in $N_{i+1}$. Now we have that $\alpha \in N$ is an ordinal rank with $j(\alpha) < \alpha$, so we can apply the Boffa construction of a model of $\mathsf{NFU}$, i.e., we build a structure $A$ whose underlying set is $V_\alpha^N$ with an element-of relation $x\in^\ast y$ which holds if and only if $j(x) \in y\wedge y \in V_{j(\alpha)+1}^{N}$.
$\Delta_0$-comprehension is all that we need to ensure that this structure is a model of $\mathsf{NFU}$. By a result of Holmes, $\in$ and the restriction of $j$ to $V_\alpha$ are definable in $A$. (All that this argument needs is that the set $E=\{(\{x\},x): x \in j(V_\alpha^N)\}$ is in $N$, which ensures that $j^3(E)$ is in $V_{j(\alpha)+1}^N$. This follows from the earlier remark regarding Cartesian products.)
Now we need to relate binary relation that $A$ thinks are well-founded to binary relations that $N$ thinks are well-founded. Let $r \in V_\alpha^N$ be a binary relation on some set $x\in V_\alpha^N$. Since $\in^\ast$ and $\in$ agree on the subset relation and on what Kuratowski ordered pairs are, we have that $A\models$"$r$ is well-founded" if and only if $N \models$"$r$ is well-founded". (Note though that $\in^\ast$ and $\in$ disagree about what the elements of $x$ are, but this doesn't actually matter.)
We will also need the following claims.
Claim 1. If $\gamma \in N$ is an (internal) ordinal such that for all $\delta \leq \gamma$, $j(\delta)=\delta$, then $(\gamma, <)$ is (externally) well-founded.
Proof. This follows immediately from the fact that any element of $N$ fixed by $j$ must be an element of $N_0\cong M$, which is well-founded. $\square_{\text{claim}}$
Claim 2. If $r \in N$ is a binary relation on some set $x \in N$, then $(x,r)$ is externally ill-founded if and only if either $N \models$"$r$ is ill-founded" or there is some $b \in x$ such that $N \models j(\mathrm{rk}(b)) < \mathrm{rk}(b)$.
Proof. If $N \models r\text{ is ill-founded}$, then clearly $(x,r)$ is externally ill-founded. If there is some $b \in x$ such that $N \models j(\mathrm{rk}(b)) < \mathrm{rk}(b)$, then $r$ has a rank function with values in an ill-founded ordinal. Since the theory of $M$ (and therefore also the theory of $N$) knows that the set of ranks occuring in $(x,r)$ is an initial segment of the ordinals, we have that $(x,r)$ is externally ill-founded.
Now suppose that $N \models r\text{ is well-founded}$ and for every $b \in x$, $j(\mathrm{rk}(b))=\mathrm{rk}(b)$. By Claim 1, we have that $(x,r)$ has a rank function with values in a well-founded ordinal, wherefore $(x,r)$ itself is well-founded. $\square_{\text{claim}}$
So for any $x,r \in A$ (i.e., in $V_\alpha^N$) with $r$ an externally ill-founded binary relation on $x$, either $A \models$"$r$ is ill-founded", in which case this is witnessed by some set in $A$, or we can consider the sub-class
$$\{y \in x : (\exists \gamma \leq \mathrm{rk}(y))j(\gamma) < \gamma\}$$
which contains no $r$-minimal element and so witnesses that $r$ is externally ill-founded. So we have that $A$ is a $\beta$ish-model of $\mathsf{NFU}$.
I don't know whether $A$ satisfies the stronger version of the $\beta$ish-model condition that you mentioned. At its face, it seems like it might at best have it for "$\in$-$\Delta_0$-definable" classes.
By strengthening the theory of $M$ you can strengthen the resulting theory of $A$. Since the automorphism $j$ can't touch $\omega$, the requirements I've written imply that $M$ satisfies the (von Neumann ordinal) axiom of infinity, so $A$ will also satisfy the (Frege cardinal) axiom of infinity. ($\mathsf{NFU}+\neg\mathrm{Inf}$ implies that every set is Dedekind finite. Whatever $\omega$ turns into will not be Dedekind finite. Amusingly, I can't actually see a way to construct a $\beta$ish-model of $\mathsf{NFU}+\neg\mathrm{Inf}$.) $A$ should also satisfy the axiom of counting (i.e., that the set of finite cardinals is strongly Cantorian).
As for $\mathsf{NF}$ itself, Holmes's (tentative) proof that $\mathsf{NF}$ is consistent is a compactness argument, and to me there's no clear way to modify it to control well-foundedness of parts of the resulting model. EDIT: See the discussion in the comments.
Best Answer
(Thank you, Alice, for getting me to look at this) Before i answer i think i need a bit of clarification on what exactly a $\beta$-model is. On the face of it NFU cannot have a model which speaks about wellfoundedness without forked tongue, beco's the ordinals of any model of NF(U) are illfounded. That is to say, it is a theorem of NF(U) that there is (an explicitly definable) proper class of ordinals with no least member. (It may be that Rosser-Wang ``Nonstandard models for formal logics'' addresses your interests .. JSL some time early 1950's - they discuss NF in some detail) But it may be that you mean something subtly different.