I am currently digging deep into some optimizations on the classical iterative approaches to both DFS and BFS algorithms. The material I'm currently using at my University presents both iterative approaches as follows.
G(V,E): Connected graph on which the algorithm will be run, with a set V of vertices and a set E of edges.
A[1...n]: Array of adjacency lists for each vertex in graph (A[k] is the linked list containing the adjacent vertices of vertex k)
v[1...n]: Array to track visited vertices
u[1...n]: Array to track active vertices (vertices which have entered the data structure in use)
These are the proposed algorithms in the material of my Universitie's course:
BFS-Iterative(A,v,u,i): // on vertex i
queue <- new Queue()
markInUse(u,i) // sets u[k] = 1
queue.enqueue(i)
while not queue.empty():
k <- queue.dequeue()
markAsVisited(v,k)
for j in A[k]:
if isNotInUse(u,j) and isNotVisited(v,j): // checks if u[k] = 1
markInUse(u,j)
queue.enqueue(j)
The space complexity here is O(|V|), thanks to the use of the "in-use" array u, which ensures that no vertex (index) is pushed more than once into the queue.
DFS-Iterative(A,v,i): // on vertex i
stack <- new Stack()
stack.push(i)
while not stack.empty():
k <- stack.pop()
if isNotVisited(v,k): // checks if v[k] = 1
markVisited(v,k) // sets v[k] = 1
for j in A[k]:
if isNotVisited(v,j):
stack.push(j)
According to the materials, the space complexity here is O(|E|), and it can be improved by emulating the same behaviour as the recursive algorithm by pushing iterators into the stack (or pointers to the lists) instead of all neighbours of a specific vertex, which I also implemented successfully (not shown here).
The materials state that the use of the "trick" of using an array to keep track of the verteces "in-use" in the vector u does NOT work for the DFS, and it is left to the reader to find out why.
Why would the following modified algorithm using the "in-use" array method not work? This is my take on the implementation, since the material does not show this algorithm, it only mentions it. I ran it on a couple of simple examples and it worked well. What am I missing here?
DFS-Iterative-Modified(A,v,u,i): // same structure as BFS-Iterative-Modified
stack <- new Stack() // same as above, using now stack instead of queue
markInUse(u,i)
stack.push(i)
while not stack.empty():
k <- stack.pop()
if isNotVisited(v,k):
markAsVisited(v,k)
for j in A[k]:
if isNotInUse(u,j):
markInUse(u,j)
stack.push(j)
It does seem that this could also improve the DFS algorithm from O(|E|) to O(|V|), but maybe I just need a counter example for me to understand why it's wrong.
There are some bugs in your implementation, but I'll pretend they're not there to answer your question...
By pushing children and marking them as "in use" when you visit the parent, you can mess up the DFS order if those children would otherwise appear earlier.
Given this graph, for example:
B--C
/
A-D
\
C
Your algorithm could visit nodes in order A,B,D,C, but this is NOT a DFS order. This happens, because, when you visit A, all of its children are pushed and marked in use. When B is visited, C cannot be pushed because it's marked in use, and so it might not be visited before D.