The answer to question 1 is affirmative for GU and IN but
negative for TA. That is, the proper class formulation of
TA is not equivalent to TA, unless both are inconsistent.
Asserting that every ordinal is below an inaccessible
cardinal is clearly equivalent in ZF to asserting that the
inaccessible cardinals form a proper class. And since the
Grothendieck universes are exactly the sets $V_\kappa$ for
an inaccessible cardinal $\kappa$, the theory GU over ZFC
is equivalent to the assertion that there are a proper
class of universes. So those cases are relatively straightforward, as you had guessed.
But the case of the Tarski axiom is different, since his
universes are not transitive sets. In fact, if there is a
single inaccessible cardinal $\kappa$, one can already form
a proper class of Tarski sets. To see this, suppose that
$\kappa$ is inaccessible and $x$ is any set. Build a Tarski
set $U$ as follows:
- $U_0=\{\{x\}\}$,
- $U_{\alpha=1}=P(U_\alpha)$ and
- $U_\lambda=\bigcup_\alpha
U_\alpha$ at limit ordinals.
That is, we perform the usual cumulative
hiearchy, but starting with object {x} instead of nothing.
The resulting set $U=U_\kappa$ contains {x} as an element,
has size $\kappa$ and is closed under subsets, power sets
and small unions, so it is a Tarski universe. (Note that if
$y\in U_{\alpha+1}$, then all subsets of $y$ are also in
$U_{\alpha+1}$, and so $P(y)$ is added on the next step.
And {x} has only two subsets, which are both in $U$.) But
for sufficiently large $x$, the resulting $U$ will not be
transitive, since $x$ itself will never be added as an
element. Furthermore, since $x$ was arbitrary, from a
single inaccessible cardinal we can build a proper class of
Tarski universes. And so the proper class formulation of TA
will not be equivalent to TA, unless both theories are
inconsistent, because if the existence of an inaccessible
cardinal is consistent with ZFC, then it is consistent that
there is exactly one such cardinal, by chopping the
universe off at the second one. The resulting model will
satisfy the proper class formulation of TA but not TA
itself.
There is a cheap way of doing this, which may not be the optimal approach when a subtle task (such as the foundational question you have in mind) is the goal. But, then again, this may suffice.
Working in an appropriately strong theory, to simplify, the standard way to check that NBG is conservative over ZFC is to see that any model $M$ of ZFC can be extended to a model $N$ of NBG in such a way that the "sets" of $N$ give us back $M$. Again to simplify, assume the model $M$ is transitive. The model $N$ we associate to it is Gödel's $\mathop{\rm Def}(M)$, the collection of subsets of $M$ that are first order definable in $M$ from parameters (The proper classes are the elements of $\mathop{\rm Def}(M)\setminus M$.)
This suggests the simple solution of defining the models of "iterated-NBG" as the result of iterating Gödel's operation. So, given a transitive model $M$ of ZFC, the $\alpha$-th iterate would simply be what we usually denote $L_\alpha(M)$.
I am restricting to transitive models here, but there is a natural first order theory associated to each stage of the iteration just described (at least, for "many" $\alpha$), and I guess one could try to axiomatize it decently if enough pressure is applied.
There are some subtleties in play here. One is that most likely we want to stop the iteration way before we run into serious technicalities ($\alpha$ would have to be a recursive ordinal, for one thing, but I suspect we wouldn't want to venture much beyond the $\omega$-th iteration). Another is that the objects we obtain with this procedure would have wildly varying properties depending on specific properties of $M$.
For example, if $M$ is the least transitive model of set theory, then we "quickly" add a bijection between $M$ and $\omega$. In general, if $M$ is least with some (first order in the set theoretic universe) property, then we "quickly" add a bijection between $M$ and the size of the parameters required to describe this property (this is an old fine-structural observation. "Quickly" can be made pedantically precise, but let me leave it as is).
So you may want to work not with ZFC proper but with a slightly stronger theory (something like ZFC + "there is a transitive model of ZFC" + "there is a transitive model of "ZFC+there is a transitive model of ZFC"" + ...) if you want some stability on the theory of the transitive models produced this way. (Of course, this is an issue of specific models, not of the "iterated-NBG" theory per se).
I should add that I do not know of any serious work in the setting I've suggested, with two exceptions. One, in his book on Class Forcing, Sy Friedman briefly mentions a version of "Hyperclass forcing" appropriate to solve some questions that appear in a natural fashion once we show, for example, that no class forcing over $L$ can add $0^\sharp$. The second is by Reinhardt in the context of large cardinals and elementary embeddings, and is described by Maddy in her article "Believing the axioms. II". As far as I remember, neither work goes beyond hyperclasses, i.e., classes of classes.
Best Answer
Since the question is rather philosophical (e.g., "right notion"), I'll use it as an excuse to record my philosophical opinions on this topic. The intuition underlying ZFC, i.e., the intuition of the cumulative hierarchy of sets, contains two quite vague notions, (1) the notion of "arbitrary subset" of an infinite set, used at successor stages of the hierarchy, and (2) the notion of iterating "forever", beyond any imaginable bound. Although these ideas are vague, they have consequences that can be expressed precisely, and the point of the ZFC axioms is to express enough of the consequences to serve as a foundation for what mathematician ordinarily do.
To add proper classes to the picture, as in the von Neumann-Bernays-Gödel or Morse-Kelley theories, is to add one more level to the cumulative hierarchy, after all the sets. This is technically useful for some purposes (including some aspects of category theory), but it is incoherent with aspect (2) of the intuition of sets. If it's possible to add one more level, then the hierarchy of sets should have been continued to include that level and many more beyond it.
For this reason, I view ZFC, possibly augmented with (mild) large-cardinal axioms or reflection principles as an intuitively more acceptable foundation than a class theory. I might well use the terminology of proper classes as a convenient abbreviation for statements about sets (e.g., "$V=L$" abbreviates "all sets are constructible", which can be defined in hte purely set-theoretic context of ZFC). But when people make serious use of proper classes, my picture of what they're doing is that their sets are really just sets of rank below some inaccessible cardinal $\kappa$ and their proper classes are really sets of rank $\kappa$. If they need super-classes of classes and even higher-rank collections, that's no problem as far as I'm concerned; the universe of sets stretches way beyond $\kappa$.