[Math] Algorithm for topological sorting without explicit edge list

algorithmscomputational complexitygraph theory

Suppose I have a set of vertices $V$ and a function $f(V_1, V_2)$ which given two vertices returns +1 if there is an edge from $V_1$ to $V_2$, -1 if there is an edge from $V_2$ to $V_1$, and 0 otherwise.

Then let $S_0$ be the set of vertices with no incoming edges, $S_1$ be the set of vertices with incoming edges from vertices in $S_0$ only, $S_2$, be the set of vertices with incoming edges from vertices in $S_0$ and $S_1$, etc.

I'm looking for an algorithm that can sort the set $V$ so that every vertex in $S_0$ is placed before every vertex in $S_1$, and so on. I already know it's related to topological sorting but computing the whole set of edges for use in either of the algorithms given in Wikipedia would take $O(n^2)$ time.

I've written my own algorithm which uses the function $f()$ directly instead of precomputing the edges but it also takes $O(n^2)$ time. My question is: is it possible to do any better than $O(n^2)$ time? I'd be very grateful even to have one with a lower best-case running time.

–edit

Assume that $f()$ is guaranteed to produce a directed acyclic graph.

Best Answer

I'm pretty sure, that it is impossible to find a (pesimistically) faster algorithm then $O(n^2)$. Suppose that you have an algorithm (lets call it $G$), that is able to sort your vertices $V_i$ in a topological manner. I can give you a "bad player" (call it $B$), which will create a counterexample for its strategy, forcing him to ask exactly $\frac{n(n-1)}{2}$ times about the edges.

Suppose that $G$ does not ask any questions, and figures out some ordering $\sigma$, and $||V||=n$.

We are looking at its answer $\sigma(1)$, which is an ID of the verticle from $S_0$. $B$ constructs the graph, where any given verticle (lets say - the one with the smallest $id$ different then $\sigma(1)$ which does not form a cycle), has an outgoiing edge to the $\sigma(1)$. This way $G$ has to ask for this knowledge before making an ordering.

As a result, there is a new $\sigma(1)$ value, so $B$ repeats its move.

...

It is easy to see, that after exactly $\frac{n(n-1)}{2}$ questions, there are no more vertices that we can connect to the particular $\sigma(1)$ and so $B$ cannot make $G$'s life harder anymore. As a result, $G$ constructed correct ordering, but had to ask $O(n^2)$ questions, qed.

Modelling this as a game may look counter intuitive at the begining, but it is actualy quite natural approach. $B$ player is simply our method of thinking about worst case scenario for any possible algorithm $G$. In the same time we have just shown, that for each algorithm, there is such an input graph, that will require asking about each pair of vertices at least once (not counting transpoitions). It doest not mean, that an algorithm cannot run faster, if you think about it in terms of parallel programming, you could do much better. But for one "thread" approach, you need $O(n^2)$ time in the worst case. Of course there can be an algorithm which in mean case (or even in 99,9% of cases) performs much faster.

Related Question