On the subject of categorical versus set-theoretic foundations there
is too much complicated discussion about structure that misses the
essential point about whether "collections" are necessary.
It doesn't matter exactly what your personal list of mathematical
requirements may be -- rings, the category of them, fibrations,
2-categories or whatever -- developing the appropriate foundational
system for it is just a matter of "programming", once you understand
the general setting.
The crucial issue is whether you are taken in by the Great Set-Theoretic
Swindle that mathematics depends on collections (completed infinities).
(I am sorry that it is necessary to use strong language here in order to
flag the fact that I reject a widely held but mistaken opinion.)
Set theory as a purported foundation for mathematics does not and cannot
turn collections into objects. It just axiomatises some of the intuitions
about how we would like to handle collections, based on the relationship
called "inhabits" (eg "Paul inhabits London", "3 inhabits N"). This
binary relation, written $\epsilon$, is formalised using first order
predicate calculus, usually with just one sort, the universe of sets.
The familiar axioms of (whichever) set theory are formulae in first order
predicate calculus together with $\epsilon$.
(There are better and more modern ways of capturing the intuitions about
collections, based on the whole of the 20th century's experience of algebra
and other subjects, for example using pretoposes and arithmetic universes,
but they would be a technical distraction from the main foundational issue.)
Lawvere's "Elementary Theory of the Category of Sets" axiomatises some
of the intuitions about the category of sets, using the same methodology.
Now there are two sorts (the members of one are called "objects" or "sets"
and of the other "morphisms" or "functions"). The axioms of a category
or of an elementary topos are formulae in first order predicate calculus
together with domain, codomain, identity and composition.
Set theorists claim that this use of category theory for foundations
depends on prior use of set theory, on the grounds that you need to start
with "the collection of objects" and "the collection of morphisms".
Curiously, they think that their own approach is immune to the same
criticism.
I would like to make it clear that I do NOT share this view of Lawvere's.
Prior to 1870 completed infinities were considered to be nonsense.
When you learned arithmetic at primary school, you learned some rules that
said that, when you had certain symbols on the page in front of you,
such as "5+7", you could add certain other symbols, in this case "=12".
If you followed the rules correctly, the teacher gave you a gold star,
but if you broke them you were told off.
Maybe you learned another set of rules about how you could add lines and
circles to a geometrical figure ("Euclidean geometry"). Or another one
involving "integration by parts". And so on. NEVER was there a "completed
infinity".
Whilst the mainstream of pure mathematics allowed itself to be seduced
by completed infinities in set theory, symbolic logic continued and
continues to formulate systems of rules that permit certain additions
to be made to arrays of characters written on a page. There are many
different systems -- the point of my opening paragraph is that you can
design your own system to meet your own mathematical requirements --
but a certain degree of uniformity has been achieved in the way that they
are presented.
We need an inexhaustible supply of VARIABLES for which we may substitute.
There are FUNCTION SYMBOLS that form terms from variables and other terms.
There are BASE TYPES such as 0 and N, and CONSTRUCTORS for forming new
types, such as $\times$, $+$, $/$, $\to$, ....
There are TRUTH VALUES ($\bot$ and $\top$), RELATION SYMBOLS ($=$)
and CONNECTIVES and QUANTIFIERS for forming new predicates.
Each variable has a type; formation of terms and predicates must respect
certain typing rules, and each formation, equality or assertion of a
predicate is made in the CONTEXT of certain type-assignments and
assumptions.
There are RULES for asserting equations, predicates, etc.
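To make the shape of such a system concrete, here is a minimal sketch in Python of the term-formation part only: variables typed by a context, function symbols with fixed signatures, and one type constructor. The names `Prod`, `Var`, `App`, `SIGNATURE` and `type_of` are hypothetical and chosen for illustration; predicates, connectives, quantifiers and equality rules are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prod:
    """Product type formed by the constructor x from two existing types."""
    left: object
    right: object


@dataclass(frozen=True)
class Var:
    """A variable; its type is looked up in the context."""
    name: str


@dataclass(frozen=True)
class App:
    """A function symbol applied to a tuple of argument terms."""
    symbol: str
    args: tuple = ()


# Hypothetical signature: symbol -> (argument types, result type).
# The base type N is represented by the string "N".
SIGNATURE = {
    "zero": ((), "N"),
    "succ": (("N",), "N"),
    "plus": (("N", "N"), "N"),
    "pair": (("N", "N"), Prod("N", "N")),
}


def type_of(term, context):
    """Return the type of `term` in `context`, or raise TypeError."""
    if isinstance(term, Var):
        if term.name not in context:
            raise TypeError(f"variable {term.name!r} is not in the context")
        return context[term.name]
    if isinstance(term, App):
        if term.symbol not in SIGNATURE:
            raise TypeError(f"unknown function symbol {term.symbol!r}")
        arg_types, result_type = SIGNATURE[term.symbol]
        if len(term.args) != len(arg_types):
            raise TypeError(f"{term.symbol!r} applied to wrong number of arguments")
        # Each argument must itself be well formed, with the expected type.
        for arg, expected in zip(term.args, arg_types):
            actual = type_of(arg, context)
            if actual != expected:
                raise TypeError(f"expected {expected!r}, got {actual!r}")
        return result_type
    raise TypeError("not a term")


# The term plus(x, succ(zero)) is well formed in the context x : N.
example = App("plus", (Var("x"), App("succ", (App("zero"),))))
print(type_of(example, {"x": "N"}))  # N
```

Nothing here mentions a collection of all terms or all types: the checker only inspects the finite piece of syntax it is handed, which is the point being made about rules acting on symbols on a page.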
We can, for example, formulate ZERMELO TYPE THEORY in this style. It has
type-constructors called powerset and {x:X|p(x)} and a relation-symbol
called $\epsilon$. Obviously I am not going to write out all of the details
here, but it is not difficult to make this agree with what ordinary
mathematicians call "set theory" and is adequate for most of their
requirements.
Alternatively, one can formulate the theory of an elementary topos in this
style, or any other categorical structure that you require. Then a "ring"
is a type together with some morphisms for which certain equations are
provable.
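As an illustration, the data of a ring in this sense might be laid out as follows. This is only a sketch under one common convention (unital, not necessarily commutative); the symbol $R$ and the particular choice of operations are assumptions made for the example.

$$
\begin{aligned}
&\text{type: } R \\
&\text{terms: } 0, 1 : R, \qquad {+}, {\cdot} : R \times R \to R, \qquad {-} : R \to R \\
&\text{equations, in the context } x, y, z : R\text{:} \\
&\qquad (x + y) + z = x + (y + z), \quad x + y = y + x, \quad x + 0 = x, \quad x + (-x) = 0, \\
&\qquad (x \cdot y) \cdot z = x \cdot (y \cdot z), \quad 1 \cdot x = x = x \cdot 1, \\
&\qquad x \cdot (y + z) = x \cdot y + x \cdot z, \quad (x + y) \cdot z = x \cdot z + y \cdot z.
\end{aligned}
$$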
If you want to talk about "the category of sets" or "the category of rings"
WITHIN your type theory then this can be done by adding types known as
"universes", terms that give names to objects in the internal category
of sets and a dependent type that provides a way of externalising
the internal sets.
So, although the methodology is the one that is practised by type theorists,
it can equally well be used for category theory and the traditional purposes
of pure mathematics. (In fact, it is better to formalise a type theory
such as my "Zermelo type theory" and then use a uniform construction to
turn it into a category such as a topos. This is easier because the
associativity of composition is awkward to handle in a recursive setting.
However, this is a technical footnote.)
A lot of these ideas are covered in my book "Practical Foundations of
Mathematics" (CUP 1999), http://www.PaulTaylor.EU/Practical-Foundations
Since writing the book I have written things in a more type-theoretic
than categorical style, but they are equivalent. My programme called
"Abstract Stone Duality", http://www.PaulTaylor.EU/ASD is an example of the
methodology above, but far more radical than the context of this question
in its rejection of set theory, ie I see toposes as being just as bad.
I'm not entirely sure what you're looking for in an answer, but maybe I'll flesh out my comment.
It looks like what you're describing is equivalent to the homotopy category associated to the model structure on Cat where the weak equivalences are equivalences of categories. (I can say "the" because there is only one such, as pointed out in the comments. The cofibrations are functors injective on objects, and the fibrations are "isofibrations".)
I would say that in this context your category has been much studied. In particular, it is interesting to ask questions about homotopy limits and colimits in this category because many useful constructions arise in this way. (Homotopy (co)limits with this model structure are the same as "2-(co)limits" which is the name appearing in most of the literature, especially older literature.)
An example application of this language is the following theorem: The subcategory of presentable (resp. accessible) categories is closed under homotopy limits.
Using this one can prove that most of your favorite things are presentable (resp. accessible). For example, the category of modules over a monad arises via a homotopy limit construction, and this takes care of most things of interest.
Here's a neat application of this (which is the ordinary category version of a result that can be found, for example, in Lurie's HTT, 5.5.4.16).
Say you want to localize a category $\mathcal{C}$ with respect to some collection of morphisms, $S$. Usually $S$ will not be given as a set, but if $\mathcal{C}$ is presentable you're usually okay if $S$ is generated by a set. Well, it turns out that if $F: \mathcal{C} \rightarrow \mathcal{D}$ is a colimit-preserving functor between presentable categories, and $S$ is a (strongly saturated) collection of morphisms in $\mathcal{D}$ that is generated by a set, then $F^{-1}S$ is a (strongly saturated) collection of morphisms generated by a set. The argument goes by way of showing that the subcategory of the category of morphisms generated by $F^{-1}S$ is presentable, using a homotopy pullback square.
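For orientation, the square in question presumably has the following shape. The notation is mine and not from any source: $\mathrm{Ar}(\mathcal{C})$ stands for the category of morphisms of $\mathcal{C}$, $\mathrm{Ar}_S(\mathcal{D})$ for the full subcategory of $\mathrm{Ar}(\mathcal{D})$ on the morphisms belonging to $S$, and $F_*$ for the functor that applies $F$ to a morphism.

$$
\begin{array}{ccc}
\mathrm{Ar}_{F^{-1}S}(\mathcal{C}) & \longrightarrow & \mathrm{Ar}_{S}(\mathcal{D}) \\
\downarrow & & \downarrow \\
\mathrm{Ar}(\mathcal{C}) & \xrightarrow{\;F_*\;} & \mathrm{Ar}(\mathcal{D})
\end{array}
$$

Read this as a sketch: a morphism of $\mathcal{C}$ lies in the top left corner exactly when its image under $F$ lies in $S$, and the other three corners are the ones whose presentability is known or assumed.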
Adapting this to the model category or $\infty$-category setting, one sees immediately that localizing with respect to homology theories is totally okay and follows formally from this type of argument. (Basically, after fiddling around with cells to prove the category of spectra is presentable, you don't have to fiddle any more to get localizations. This is in contrast to the usual argument found in Bousfield's paper. You've moved the cardinality bookkeeping into a general argument about homotopy limits of presentable categories.)
Anyway, apologies for the very idiosyncratic application of this language; these things have been on my mind recently. I'm sure there are much more elementary reasons why one would care about using the model category structure on Cat.
My personal opinion is that one should consider the 2-category of categories, rather than the 1-category of categories. I think the axioms one wants for such an "ET2CC" will be something like:
Once you have all this, you can use finite 2-categorical limits and the "internal logic" to construct all the usual concrete categories out of the object "set". For instance, "set" has finite products internally, which means that the morphisms $set \to 1$ and $set \to set \times set$ have right adjoints in our 2-category Cat (i.e. "set" is a "cartesian object" in Cat). The composite $set \to set\times set \to set$ of the diagonal with the "binary products" morphism is the "functor" which, intuitively, takes a set $A$ to the set $A\times A$. Now the 2-categorical limit called an "inserter" applied to this composite and the identity of "set" can be considered "the category of sets $A$ equipped with a function $A\times A\to A$," i.e. the category of magmas.
Now we have a forgetful functor $magma \to set$, and also a functor $magma \to set$ which takes a magma to the triple product $A\times A \times A$, and there are two 2-cells relating these, constructed from two different composites of the inserter 2-cell defining the category of magmas. It makes sense to call the "equifier" (another 2-categorical limit) of these 2-cells "the category of semigroups" (sets with an associative binary operation). Proceeding in this way we can construct the categories of monoids, groups, abelian groups, and eventually rings.
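Unwinding these two limits for finite sets gives something one can compute with. The following Python sketch is an illustration under assumptions, not part of the construction itself: a set is a list, a function is a dict, and the names `is_magma`, `is_magma_hom` and `is_semigroup` are hypothetical. An object of the inserter is a set with a function $A\times A\to A$, a morphism is a function commuting with the operations, and the equifier keeps the objects on which the two composite 2-cells agree.

```python
from itertools import product


def is_magma(carrier, op):
    """Object of the inserter: op is a total function carrier x carrier -> carrier."""
    return all(
        (a, b) in op and op[(a, b)] in carrier
        for a, b in product(carrier, repeat=2)
    )


def is_magma_hom(h, carrier_a, op_a, op_b):
    """Morphism of the inserter: h(x * y) == h(x) * h(y) for all x, y."""
    return all(
        h[op_a[(x, y)]] == op_b[(h[x], h[y])]
        for x, y in product(carrier_a, repeat=2)
    )


def is_semigroup(carrier, op):
    """Object of the equifier: the two ways of multiplying a triple agree."""
    return all(
        op[(op[(a, b)], c)] == op[(a, op[(b, c)])]
        for a, b, c in product(carrier, repeat=3)
    )


# Two magma structures on {0, 1, 2}: addition and subtraction modulo 3.
Z3 = [0, 1, 2]
add = {(a, b): (a + b) % 3 for a, b in product(Z3, repeat=2)}
sub = {(a, b): (a - b) % 3 for a, b in product(Z3, repeat=2)}

print(is_magma(Z3, add), is_semigroup(Z3, add))  # True True
print(is_magma(Z3, sub), is_semigroup(Z3, sub))  # True False
```

Subtraction modulo 3 lives in the inserter but not in the equifier, since $(0-1)-1 = 1$ while $0-(1-1) = 0$.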
A more direct way to describe the category of rings with a universal property is as follows. Since $set$ is a cartesian object, each hom-category $Cat(X,set)$ has finite products, so we can define the category $ring(Cat(X,set))$ of rings internal to it. Then the category $ring$ is equipped with a forgetful functor $ring \to set$ which has the structure of a ring in $Cat(ring,set)$, and which is universal in the sense that we have a natural equivalence $ring(Cat(X,set)) \simeq Cat(X,ring)$. The above construction then just shows that such a representing object exists whenever Cat has suitable finitary structure.
One can hope for a similar elementary theory of the 3-category of 2-categories, and so on up the ladder, but it's not as clear to me yet what the appropriate exactness properties will be.