Search code examples
algorithmgraphdepth-first-searchgraph-traversal

What are some true Iterative Depth First Search implementation suggestions?


So from what I am seeing I have been thaught like most people that the iterative version of DFS is just like iterative BFS besides two differences: replace queue with stack and mark a node as discovered after POP not after PUSH.

Two questions have really been puzzling me recently:

  • For certain cases it will result in different output, but is it really necessary to mark a node as visited after we POP? Why is it wrong to do it after we PUSH? As far as I can see this will result in the same consequence as to why we do it for BFS...to not have duplicates in our queue/stack.
  • Now the big one: My impression is that this kind of implementation of iterative DFS is not true DFS at all: If we think about the recursive version it is quite space efficient since it doesnt store all the possible neighbours at one level(as we would do in the iterative version), it only selects one and goes with it and then it backtracks and goes for a second one. As an extreme example think of a graph with one node in the center and conected to 100 leaf nodes. In the recursive implementation if we start from the middle node the underlying stack will grow to maximum 2 ... one for the middle node and one for every leaf it visits. If we do it as we have been thaught with the iterative version the stack will grow to 1000 elements. That doesnt seem right.

So with all those previous details in mind, my question is what would the approach be to have a true iterative DFS implementation?


Solution

  • Question 1: If you check+mark before the push, it uses less space but changes the order in which nodes are visited. You will visit everything, though.

    Question 2: You are correct that an iterative DFS will usually put all the children of a node onto the stack at the same time. This increases the space used for some graphs, but it doesn't change the worst case space usage, and it's the easiest way so there's usually no reason to change that.

    Occasionally you know that it will save a lot of space if you don't do this, and then you can write an iterative DFS that works more like the recursive one. Instead of pushing the next nodes to visit on the stack, you push a parent and a position in its list of children, or equivalent, which is pretty much what the recursive version has to remember when it recurses. In pseudo-code, it looks like this:

    func DFS(start):
        let visited = EMPTY_SET
        let stack = EMPTY_STACK
    
        visited.add(start)
        visit(start)
        stack.push( (start,0) )
    
        while(!stack.isEmpty()):
            let (parent, pos) = stack.pop()
            if (pos < parent.numChildren()):
    
                let child = parent.child[pos]
                stack.push(parent,pos+1)
    
                if (!visited.contains(child)):
                    visited.add(child)
                    visit(child)
                    stack.push( (child,0) )
    

    You can see that it's a little more complicated, and the records you push on the stack are tuples, which is annoying in some cases. Often we'll use two stacks in parallel instead of creating tuples to push, or we'll push/pop two records at a time, depending on how nodes and child list positions have to be represented.