This is about an algorithm that is narrated in one of a prescribed text book for undergraduates who has taken datastructure as one of their subjects.
What i understood from the below algorithm is that
First you start from a node push it into notvisited stack
explore each successor node and push those node to a stack named as
nonvisited if they are not visited
poping each node from unvisited stack
Then you recursively follow this procedure until you finish traversing the graph fully
My doubt is in this line
if(vertex is not present on Visited ) Then add vertex on Visited
My understanding is
Since we are taking nodes from nonvisited stack to be inserted to the queue why do we need this line
if(vertex is not present on Visited ) Then add vertex on Visited
as a vertex cannot coexist simultaneosly at the same time in noth notvisited stack and Visited Queue
Algorithm DFS(firstVertex)
{
add firstVertex on notVisited;
place NULL on Visited;
while (notVisited is not empty)
{
remove vertex from notVisited;
generate successors of vertex;
for each successor of vertex do
{
if (successor is not present on Visited and successor is not present on notVisited)
Then add vertex on front of notVisited;
}
if (vertex is not present on Visited)
Then add vertex on Visited;
}
}
You are right that the condition of the following statement
if (vertex is not present on Visited) Then add vertex on Visited;
...is always going to be true. A vertex is not supposed to be in both notVisited
and Visited
at the same time, and since we had just removed vertex
from notVisited
it cannot be in Visited
at that time.
There is however a boundary case if the graph could have an edge from a node to itself (a loop). In that case vertex
could be a successor of itself. In the first iteration of the loop, such vertex
is neither in notVisited
, nor in Visited
, and is added to the notVisited
queue again. Once the inner loop has finished, vertex
will be added to Visited
, so that now it is in both collections! This means an extra iteration of the outer loop will be executed for that same vertex
(luckily not leading to an infinite loop), but then the quoted if
condition will be false for that vertex
.
The correct way is to move vertex
immediately from notVisited
to Visited
, before the inner loop starts.
So:
Algorithm DFS(firstVertex)
{
add firstVertex on notVisited;
place NULL on Visited;
while (notVisited != NULL)
{
remove vertex from notVisited;
add vertex on Visited; // <---
generate successors of vertex;
for each successor of vertex do
{
if (successor is not present on Visited and successor is not present on notVisited)
Then add vertex on front of notVisited;
}
}
}
Now we can be sure that during the execution of the algorithm, a (reachable) vertex will always go through the following phases, in this order:
notVisited
nor in Visited
notVisited
Visited
You mention that notVisited
is a queue, but for efficiency it would be better if it were a hash set, such that checking whether a vertex is in that collection has an amortised constant time complexity (not needing to iterate over the whole queue).
For the same reasons, visited
should not need to be a stack, but can be a hash set as well.