I'm fairly sure it is impossible to find an algorithm that is (in the worst case) faster than $O(n^2)$. Suppose you have an algorithm (let's call it $G$) that is able to sort your vertices $V_i$ in a topological manner. I can give you a "bad player" (call it $B$) which will create a counterexample for $G$'s strategy, forcing it to ask exactly $\frac{n(n-1)}{2}$ times about the edges.
Suppose that $G$ asks no questions and outputs some ordering $\sigma$, where $|V| = n$.
We look at its answer $\sigma(1)$, which is the ID of a vertex from $S_0$. $B$ constructs a graph in which some vertex (say, the one with the smallest id different from $\sigma(1)$ that does not form a cycle) has an outgoing edge to $\sigma(1)$. This way $G$ has to ask about this edge before committing to an ordering.
As a result, there is a new $\sigma(1)$ value, so $B$ repeats its move.
...
It is easy to see that after exactly $\frac{n(n-1)}{2}$ questions there are no more vertices we can connect to the current $\sigma(1)$, so $B$ cannot make $G$'s life any harder. As a result, $G$ has constructed a correct ordering, but it had to ask $O(n^2)$ questions, QED.
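The adversary argument can be sketched in code. This is an illustrative simplification, not the exact strategy above: here $B$ answers "no edge" to every query, and any pair the algorithm never asked about can later be given an edge that contradicts the output ordering, so a correct algorithm is forced to query all $\frac{n(n-1)}{2}$ pairs. All names (`Adversary`, `exhaustive_sort`, `refute`) are made up for this sketch.

```python
from itertools import combinations

class Adversary:
    """Sketch of the "bad player" B: commits to "no edge" on every query."""
    def __init__(self, n):
        self.n = n
        self.asked = set()

    def has_edge(self, u, v):
        # B records the query and answers "no edge".
        self.asked.add(frozenset((u, v)))
        return False

    def refute(self, order):
        # If some pair {u, v} was never queried, B can declare an edge
        # from the later vertex to the earlier one, invalidating the
        # proposed ordering. Returns such a pair, or None.
        for u, v in combinations(range(self.n), 2):
            if frozenset((u, v)) not in self.asked:
                return (u, v)
        return None

def exhaustive_sort(n, oracle):
    # A correct algorithm has no choice but to query every pair;
    # with no edges at all, any ordering is then valid.
    for u, v in combinations(range(n), 2):
        oracle.has_edge(u, v)
    return list(range(n))

n = 6
B = Adversary(n)
order = exhaustive_sort(n, B)
assert B.refute(order) is None               # B cannot refute this ordering
assert len(B.asked) == n * (n - 1) // 2      # exactly n(n-1)/2 queries
```

Skipping even one query would let `refute` return a contradicting pair, which is exactly the lower-bound argument above.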
Modelling this as a game may look counter-intuitive at the beginning, but it is actually quite a natural approach. Player $B$ is simply our way of thinking about the worst-case scenario for any possible algorithm $G$. At the same time, we have just shown that for each algorithm there is an input graph that requires asking about each pair of vertices at least once (not counting transpositions). This does not mean that an algorithm cannot run faster; if you think about it in terms of parallel programming, you could do much better. But for a single-"thread" approach, you need $O(n^2)$ time in the worst case. Of course there can be an algorithm which in the average case (or even in 99.9% of cases) performs much faster.
Yes, that is how I would refer to them: the successors of $v$ as $N^{+}(v)$ and the predecessors of $v$ as $N^{-}(v)$. After all, "successors" is just a different word for out-neighborhood.
For the successors of the successors, I would use $N^{+}(N^{+}(v))$. You could shorten it to ${N^{+}}^{2}(v)$ or a different notation you define.
Edit: For the set of all successors of $v$ in graph $G$, i.e. the set of all vertices reachable from $v$, I've seen this set denoted as $R_{G}(v)$ or $R_{G}^{+}(v)$. The set of predecessors, i.e. the set of vertices that can reach $v$, is denoted $R_{G}^{-}(v)$.
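To make the notation concrete, here is a small Python illustration (the graph and function names are made up for the example): `adj` maps each vertex to its successor set $N^{+}(v)$, and the second-level and reachable sets are derived from it.

```python
# Toy DAG: adj[v] is the successor set N^+(v).
adj = {1: {2, 3}, 2: {4}, 3: {4}, 4: set()}

def n_plus(v):
    # N^+(v): the direct successors (out-neighborhood) of v.
    return adj[v]

def n_plus2(v):
    # N^+(N^+(v)): successors of the successors of v.
    return set().union(*(adj[u] for u in adj[v])) if adj[v] else set()

def reach(v):
    # R^+(v): all vertices reachable from v (excluding v itself here).
    seen, stack = set(), list(adj[v])
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(adj[u])
    return seen

# n_plus(1) == {2, 3}; n_plus2(1) == {4}; reach(1) == {2, 3, 4}
```

The set of predecessors $N^{-}(v)$ and the backward-reachable set $R^{-}(v)$ would be computed the same way on the reversed adjacency map.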
Bollobás uses the notation $\vec{E}(S_1,S_2)$.