Search code examples
c++algorithmrecursiongraphtopological-sort

Transforming recursive DFS-based topological sort into a non-recursive algorithm (without losing cycle detection)


Here is a pseudocode for topological sort from Wikipedia:

L ← Empty list that will contain the sorted nodes
while there are unmarked nodes do
    select an unmarked node n
    visit(n) 
function visit(node n)
    if n has a temporary mark then stop (not a DAG)
    if n is not marked (i.e. has not been visited yet) then
        mark n temporarily
        for each node m with an edge from n to m do
            visit(m)
        mark n permanently
        unmark n temporarily
        add n to head of L

I want to write it non-recursively without losing cicle detection.

The problem is I don't know how to do that and I thought of many approaches already. Basically the problem is to do DFS but with remembering the "current path" (it corresponds to "temporary marking" certain nodes in pseudocode above). So traditional approach with a stack gives me nothing because when using a stack (and putting neighbors of every node in it) I'm putting nodes there even though I will see them "in the undetermined future" and I only want to keep track of nodes "on my current path" (I see it as walking through a maze with a thread I'm leaving behind me - when I see a dead end, I turn back and "wrap the tread" when doing that and at any point in time I want to remember nodes "with thread lying on them" and nodes on which the thread has been at least once). Any tips that would point me in the right direction? I mean - should I think of using 2 stacks instead of 1, maybe some other data structure?

Or maybe this algorithm is OK and I should leave it in its recursive form. I'm only worrying about exceeding the "recursion depth" for sufficiently large graphs.


Solution

  • Obviously, you'd use a stack but you wouldn't put all adjacent nodes anyway: that would yield a DFS with the wrong size complexity anyway (it would be quadratic in the number of nodes assuming non-parallel edges, otherwise potentially worse). Instead, you'd store the current node together with a state indicating the next node to be visited. You'd always work off the stack's top, i.e., something like this:

    std::stack<std::pair<node, iterator> stack;
    stack.push(std::make_pair(root, root.begin()));
    while (!stack.empty()) {
        std::pair<node, iterator>& top = stack.top();
        if (top.second == top.first.begin()) {
            mark(top.first);
            // do whatever needs to be done upon first visit
        }
        while (top.second != top.first.end() && is_marked(*top.second)) {
            ++top.second;
        }
        if (top.second != top.first.end()) {
            node next = *top.second;
            ++top.second;
            stack.push(std::make_pair(next, next.first());
        }
        else {
            stack.pop();
        }
    }
    

    This code assumes that nodes have a begin() and end() yielding suitable iterators to iterate over adjacent nodes. Something along those lines, possibly with an indirection via edges will certainly exist. It also assumes that there are functions available to access a node's mark. In a more realistic imlementation that would probably use something a BGL property map. Whether a std::stack<T> can be used to respresent the stack depends on whether the nodes currently on the stack need to be accessed: std::stack doesn't provide such access. However, it is trivial to create a suitable stack implementation based on any of the STL sequence containers.