My view is that the large cardinal hierarchy already has all the
principal features of the project you are proposing.
Each of the large cardinal concepts can be regarded as
corresponding to a certain conception of the set-theoretic
universe: one entertains the von Neumann hierarchy up to
such a cardinal, and this makes a perfectly good universe. Every inaccessible cardinal $\kappa$, for
example, gives rise to $V_\kappa$, a transitive model of ZFC and
in fact a Grothendieck universe. Every Mahlo cardinal $\lambda$ is a
limit of many inaccessible cardinals $\kappa\lt\lambda$, and the
models $V_\kappa\subset V_\lambda$ have much the same relation as
what you describe in your question. If $\lambda$ is Mahlo, then
the smaller models $V_\kappa$ for inaccessible $\kappa\lt\lambda$,
which are perfectly good set-theoretic universes, each extend up
to $V_\lambda$, a larger universe having what it thinks is a
proper class of inaccessible cardinals (and hence also the
Universe Axiom). Indeed, when $\lambda$ is Mahlo then the
collection of inaccessible cardinals is not merely unbounded in
$\lambda$, as you request, but also forms what is known as a
stationary class in $\lambda$, meaning that it intersects
nontrivially with every closed unbounded set. This seems to extend
and refine the idea of your cofinal tallness. Similarly, every
weakly compact cardinal is a stationary limit of Mahlo cardinals,
and if $\delta$ is a measurable cardinal, then not only are the
weakly compact cardinals below $\delta$ stationary in $\delta$,
but they form a set of normal measure one, a much stronger notion.
This reflection phenomenon is nearly universal in the large
cardinal hierarchy, where properties of the larger large cardinals
reflect down to robust classes of the smaller cardinals. The
strong cardinals reflect in this way down to the measurable
cardinals, and the Mitchell order carries this idea still further.
Supercompactness reflects down to superstrongness. It is an
intensely studied phenomenon.
In this sense, the subject of large cardinal set theory is already
undertaking your project. What we are studying is precisely how all the various large cardinals can be construed as smaller universes extending into larger ones. For the large cardinals that are axiomatized in terms of the existence of certain embeddings $j:V\to M$, this extension process proceeds in two ways: $M$ is larger than $V$ in the sense that $\text{ran}(j)\subset M$, and $M$ is smaller than $V$ in the sense that $M\subset V$. It is the interplay and tension between these two senses that gives rise to much of the power of these axioms.
I would say that this includes elements of algebra, broadly
construed, if one regards the direct limits and large systems of
large cardinal embeddings that arise in the theory as having an
essentially algebraic aspect. Surely the extender-embedding
concepts developed in the theory of canonical inner models exhibit
a fundamentally algebraic character.
And the subject is hugely involved with philosophical
considerations, which guide the choice of new large cardinal
axioms and motivate or attempt to explain why we should
believe that they are consistent or true. One can say infinitely
more about this.
You ask: Has anyone seriously explored what mathematics would look like in the absence of PUC?
Actually, I believe I have done so in my PhD thesis 'Modern intuitionistic topology'. Don't let the title mislead you: more than half of the thesis is directly acceptable in BISH, and these results are all marked with an asterisk.
In my thesis, I define 'function' precisely as what you describe, namely the ZF-definition. I have always considered this set-theoretic approach very convenient for a fundamental reunion of Bishop's and Brouwer's perspectives. The usual dictum that BISH is a common core of INT and CLASS is actually untrue, since many of Bishop's definitions are too vague and therefore unacceptable in INT. But they can be naturally interpreted using the ZF-definition, which I adopted in my thesis.
In my thesis, pretty large samples of constructive math are developed using this definition. (I would not call them large swaths, but I see no obstruction to expanding these samples to large swaths.) I cannot recall even one instance where PUC would help one prove more than what can be proven using the ZF-definition.
Just an example: the Dugundji Extension Theorem is proved in this way in chapter three.
Of course, I do use restricted forms of countable choice and dependent choice in my thesis. But that seems a separate issue from what you are asking.
To then specialize to INT or RUSS (recursive mathematics), I found (and find) it very convenient to define special functions, such as spread-functions and computable functions. One shows that these special functions indeed also specify functions (ZF-style), and then one can prove additional INT or RUSS results using either INT or RUSS choice axioms.
So I'm fairly confident in my answer that yes, large parts of constructive mathematics can be developed without PUC.
Update to reflect the comments below:
In my thesis the definition of function (0.1.3) essentially runs like this:
Let $(X,\#_1)$ and $(Y,\#_2)$ be apartness spaces; then $f\subset X\times Y$ is called a function iff:
(i) $\forall x\in X\ \exists y\in Y\ [\,(x,y)\in f\,]$
(ii) $\forall x,w\in X\ \forall y,z\in Y\ [\,((x,y)\in f\wedge(w,z)\in f\wedge y\#_2 z) \rightarrow x\#_1 w\,]\,$
With this definition of function, large samples of constructive math are developed, like I said. This does not require any form of PUC, and it is debatable whether PUC even holds in general with this definition.
That depends on how literally one wants to interpret PUC, and on whether one wishes to include equivalence/apartness relations which are not the 'simple' Baire space equivalence/apartness.
But even to say that with this definition PUC holds for Baire space is to interpret the quantifier $\exists!$ in a very specific way, because one cannot derive (ii) from the usual interpretation of $\forall x\,\exists! y$.
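For reference, here is the principle of unique choice as it is usually stated, with the standard equality-based reading of $\exists!$ (my paraphrase, not a quotation from the thesis):

$$\forall x\,\exists! y\,P(x,y)\ \rightarrow\ \exists f\,\forall x\,P(x,f(x)),
\qquad\text{where }\exists! y\,P(x,y)\ :\equiv\ \exists y\,\bigl(P(x,y)\wedge\forall z\,(P(x,z)\rightarrow z=y)\bigr).$$

Nothing in this reading mentions the apartness relations $\#_1,\#_2$, which is why clause (ii) above cannot be derived from it.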
Second update to reflect further comments below:
First, I would like to recommend the OP's paper 'Linear logic for constructive mathematics'. I find it well written, and it gives a promising (meta-)insight into how to resolve annoying deficiencies of constructive logic, especially when constructivizing classical definitions.
Second, let me repeat that the above definition of function is far from the usual one in BISH (which is very vague! See chapter 2, definition 1.1 of Constructive Analysis [Bishop & Bridges 1985]). It therefore stands alone in BISH, and it is not a trivial result that with this definition large parts of BISH can be developed.
Moreover, in BISH there is a real necessity for PUC as an axiom, since it does not follow at all from the definition of function and the other axioms. I give the following quotes (page 12, chapter 1, 'A constructivist manifesto', written in 1967):
A choice function exists in constructive mathematics, because a choice
is implied by the very meaning of existence.
And a later addendum (1985) by Bridges (page 13, chapter 1, Notes):
At first sight, Bishop's remark, "A choice function exists in constructive mathematics, because a choice is implied by the very meaning
of existence", appears to be contradicted by counterexamples of the
sort discussed in connection with the least-upper-bound principle. In
fact, there is no contradiction here. To see this, consider a
paraphrase of Bishop's remark: if to each $x$ in a set $A$ there
corresponds an element $y$ of a set $B$ such that a given property $P(x, y)$
holds, then it is implied by the very meaning of existence in
constructive mathematics that there is a finite routine for computing
an appropriate $y \in B$ from a given $x \in A$; although this routine may
not be a function relative to the given equality relation on $A$, it is
a function relative to the equality relation of identity (intensional
equality) on $A$, in which two elements are equal if and only if they
are given as identically the same object.
With the relational definition given in my thesis, there is no need for PUC as an axiom, like I said. And this remains so even if one interprets the quantifier $\exists!$ in such a way that PUC is validated for the natural apartness on Baire space. This natural apartness is not an extra structure, but the intuitionistic (and constructive!) way of handling equality. It corresponds rather precisely I believe to the linear-logic way of defining inequality (in the OP's paper mentioned above).
To give one answer to the question in the title: a reason to prefer one way of defining the same 'idea' over another is that it generalizes better. (Where, of course, what 'better' means can depend on context.)
I have nothing to say about equivalence relations, but since other examples are asked for as well:
The notion 'prime number' can perhaps serve as an example of what I mean.
For the natural numbers one can define it in various equivalent ways. In particular (we exclude $1$, and also $0$ if one considers it a natural number):
A number $p$ is a prime number if $p=ab$ implies $p=a$ or $p=b$. (Or, put differently, $p$ is only divisible by $1$ and itself.)
A number $p$ is a prime number if $p\mid ab$ implies $p\mid a$ or $p\mid b$.
These are equivalent for the natural numbers. But I am convinced that the latter is the better definition of prime number. (Though, to answer the side questions: when teaching introductory material I might use the former, since students may already be familiar with it as the definition, and in any case I will not have the time to convey why the other is better. Also, I think this conviction is not universal now, and certainly was not earlier.)
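The claimed equivalence on the naturals can be sanity-checked by brute force for small numbers; a minimal sketch (my own, with an arbitrary search bound, not part of the answer):

```python
def irreducible(p):
    # First definition: p = a*b implies p == a or p == b,
    # i.e. in every factorization p = a * (p // a), p equals one factor.
    return all(p in (a, p // a) for a in range(1, p + 1) if p % a == 0)

def prime_property(p, bound=50):
    # Second definition: p | a*b implies p | a or p | b,
    # checked here only for a, b below a small bound.
    return all(a % p == 0 or b % p == 0
               for a in range(1, bound) for b in range(1, bound)
               if (a * b) % p == 0)

# The two definitions agree on the naturals 2..49.
assert all(irreducible(p) == prime_property(p) for p in range(2, 50))
```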
So why am I convinced of this? Because if you pass from the natural numbers/integers to more general things, say rings of algebraic integers, Dedekind domains, or domains in general, it is this definition that generalizes better, yielding the notion of a prime element.
The property given by the former is also interesting in more general situations, but one would not call such an element prime; rather, it is called irreducible, for example.
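The classic witness that the two notions genuinely come apart is the ring $\mathbb{Z}[\sqrt{-5}]$, where $2$ is irreducible but not prime: it divides $6=(1+\sqrt{-5})(1-\sqrt{-5})$ yet divides neither factor. A small computational check of the non-primeness (my own sketch, representing $a+b\sqrt{-5}$ as the pair $(a,b)$; the bounded quotient search is ample for these small elements):

```python
def mul(x, y):
    # (a + b*sqrt(-5)) * (c + d*sqrt(-5)) = (ac - 5bd) + (ad + bc)*sqrt(-5)
    a, b = x
    c, d = y
    return (a * c - 5 * b * d, a * d + b * c)

def divides(x, y, bound=10):
    # Does x divide y in Z[sqrt(-5)]?  Brute-force search for an integer
    # quotient q with x*q == y over a small coefficient box.
    return any(mul(x, (p, q)) == y
               for p in range(-bound, bound + 1)
               for q in range(-bound, bound + 1))

u, v = (1, 1), (1, -1)            # 1 + sqrt(-5) and 1 - sqrt(-5)
assert mul(u, v) == (6, 0)        # their product is 6
assert divides((2, 0), (6, 0))    # 2 divides 6 ...
assert not divides((2, 0), u)     # ... but 2 divides neither factor,
assert not divides((2, 0), v)     # so 2 is not prime in Z[sqrt(-5)]
```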
Thus, in this case I would not say there are two definitions for the 'idea' of prime number that are equivalent.
Rather, there are really two 'ideas', that of an irreducible element and that of a prime element, and for the natural numbers (and the integers) one can show that they coincide.
The former definition yields the former notion, and the latter the latter. And, the latter being somehow more pertinent, as also documented by the naming, this is the definition of prime number (in the naturals) that I prefer, since it generalizes better.