I'm learning about graph search and tree search recently and I see many examples mention something like "the path returned by xxx graph search is..." However, graph search is a traversal method, how can it return a path?
We know in tree search, each node corresponds to a particular path and by visiting a node, we know the corresponding path. I'm familiar with tree search enough, but, in graph search, the intention is to visit the destination, rather than find a path to the destination. Then, how can we define a path returned by graph search?
For example, below is a graph.
Graph
B
/ \
/ \
Start Destination
\ /
\ /
C
When we do BFS on start, the visiting sequence is Start-B-C-Destination
or Start-C-B-Destination
. Then, what should be the returned path? If we are using a queue, we can say we enqueue Destination
when visiting B
(or C
) so the returned path is Start-B-Destination
(or Start-C-Destination
), then what about the following graph?
Graph
B
/ \
/ \
Start-----D-----Destination
\ /
\ /
C
Clearly, D
is enqueued when visiting Start
, so the only possible path should be Start-D-Destination
?
There are more problems in A* search. If a node is visited not from the shortest path (due to poor heuristic function), then what should be the final returned path? For example, in the below graph, the visiting sequence is Start-B-D-C-Destination
, then what should be the returned path? Start-B-D-Destination
or Start-C-D-Destination
? We visited D
when B
is in the fringe, however, when we calculate the cost from Start
to D
, we are calculating Start-C-D
and the cost should be 2 rather than 3.
Graph
B
/ \
1 2
/ \
Start D--5--Destination
\ /
1 1
\ /
C
h(B) = 2, h(C) = 4, h(D) = 1
I think all these problems are caused by the intention of graph search isn’t to find a path and graph search will lead to ambiguity when selecting which node should be included in the final path.
Thanks for you patience and I’ll appreciate it a lot if anyone can give me an answer.
I will use your example:
B
/ \
1 2
/ \
Start D--5--Destination
\ /
1 1
\ /
C
Note that a pure Breadth-First Search pays no attention to edge weights (or cost). In this case, the first path that takes it to the destination might be Start->B->D->Destination
.
It marks each node it visits with a boolean flag to indicate that the node has been visited, so that it doesn't pursue Start->B->Start->C->Start->...
In order to remember the path that succeeded, we must store a little more information somewhere. One simple approach is to mark each node, not just with the visited
flag, but with a reference to the node from which it was visited. So if the visitation order was Start, B, C, D, Destination
, the graph will look like this:
B(Start)
/ \
1 2
/ \
Start D(B)--5--Destination(D)
\ /
1 1
\ /
C(Start)
Then we can backtrack. How did we get to Destination
? From D
. How did we get to D
? From B
. How did we get to B
? from Start
.
Breadth-First Search follows the first edge in the queue, and finds the shortest path (in the sense of smallest number of edges). If we want the least-weight (cheapest) path, we modify the search to keep track of cost-to-get-to-this-node, and follow the edge that reaches a new node most cheaply. (This is one version of Dijkstra's Algorithm.) Then in this case the visitation order will be the same, but we will visit D
from C
, not from B
, and get the path Start->C->D->Destination
.
With A*, using h(B) = 2, h(C) = 4, h(D) = 1, the visitation order is Start, B, D, C, D, Destination
, and the path is Start->C->D->Destination
.