The problem: finding the path to the closest of multiple goals on a rectangular grid with obstacles. Only moving up/down/left/right is allowed (no diagonals). I did see this question and its answers, and this one, and that one, among others. I didn't see anyone use or suggest my particular approach. Is there a major mistake in my approach?
My most important constraint here is that it is very cheap for me to represent the path (or any list, for that matter) as a "stack", or a "singly-linked-list", if you want. That is, constant time access to the top element, O(n) for reversing.
The obvious (to me) solution is to search the path from any of the goals to the starting point, using a Manhattan distance heuristic. The first path found from a goal to the starting point would be a shortest path to the closest goal (one of possibly many), and I don't need to reverse the path before following it (it would be in the "correct" order, starting point on top and goal at the end).
In pseudo-code:
A*(start, goals) :
    p_queue = init_priority_queue(start, goals)
    return path(start, p_queue)

init_priority_queue(start, goals) :
    p_queue = empty_queue()
    for (g in goals) :
        f = manhattan_distance(start, g)        # g-cost is 0 at a goal
        insert(f, push(g, empty_path()), p_queue)
    return p_queue

path(start, p_queue) :
    f, path = extract_min(p_queue)
    if (top(path) == start) :
        return path
    else :
        expand(start, path, p_queue)
        return path(start, p_queue)

expand(start, path, p_queue) :
    this = top(path)
    for (n in next(this)) :
        f = length(path) + manhattan_distance(start, n)   # f = g + h
        new_path = push(n, path)
        insert(f, new_path, p_queue)
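As a concrete sketch of the idea in Python (the `passable` obstacle predicate, the 4-neighbour expansion, and the tuple-based cons-list representation of paths are my own choices, not part of the question):

```python
import heapq

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def astar_reversed(start, goals, passable):
    """Search from the goals back toward start.  Paths are immutable
    cons cells (head, tail), so the first finished path popped is
    already ordered start first, goal last -- no reversal needed."""
    tie = 0          # tie-breaker keeps heap entries comparable
    pq = []
    for g in goals:  # seed the frontier with every goal at g-cost 0
        heapq.heappush(pq, (manhattan(start, g), tie, 0, (g, None)))
        tie += 1
    seen = set()
    while pq:
        f, _, gcost, path = heapq.heappop(pq)
        cell = path[0]
        if cell == start:
            out = []                 # flatten the cons list
            while path is not None:
                out.append(path[0])
                path = path[1]
            return out               # start ... goal, in order
        if cell in seen:
            continue
        seen.add(cell)
        x, y = cell
        for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if passable(n) and n not in seen:
                tie += 1
                heapq.heappush(pq, (gcost + 1 + manhattan(start, n),
                                    tie, gcost + 1, (n, path)))
    return None  # no goal reachable
```

On an obstacle-free 3x3 grid with start `(0, 0)` and goal `(2, 2)`, this returns a 5-cell path beginning at the start and ending at the goal.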
To me it seems only natural to reverse the search in this way. Is there a think-o in here?
And another question: assume that my priority queue is stable for elements with the same priority (if two elements have the same priority, the one inserted later comes out earlier). I have left my next above undefined on purpose: randomizing the order in which it returns the possible next tiles on the grid seems a very cheap way of finding an unpredictable, rather zig-zaggy path through a rectangular area free of obstacles, instead of one that runs along two of the edges (a zig-zag path is just statistically more probable). Is that correct?
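For illustration, such a randomized next could look like this in Python (the name `next_cells` is mine, chosen to avoid shadowing Python's built-in `next`):

```python
import random

def next_cells(cell):
    """Return the four grid neighbours of a cell in random order, so
    that among equally good frontier paths the search does not always
    favour the same direction -- the zig-zag effect described above."""
    x, y = cell
    ns = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    random.shuffle(ns)   # in-place shuffle, O(1) extra work per cell
    return ns
```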
It's correct and efficient in big-O terms as far as I can see: O(N log N) as long as the heuristic is admissible and consistent, where N is the number of cells in the grid, assuming you use a priority queue whose operations run in O(log N). The zig-zag will also work.
P.S. For this sort of problem there is a more efficient "priority queue" that works in O(1). By this sort of problem I mean the case where the effective distance between every pair of adjacent nodes is a very small constant (3 in this problem).
Edit: as requested in the comments, here are the details of a constant-time "priority queue" for this problem.
First, transform the graph as follows. Let the potential of a node in the graph (i.e., a cell in the grid) be the Manhattan distance from that node to the goal (i.e., the heuristic); denote the potential of node i by P(i). In the original graph there is an edge of weight 1 between every pair of adjacent cells. In the modified graph, the weight w(i, j) is changed to w(i, j) - P(i) + P(j). This is exactly the graph used in the proof that A* is optimal and terminates in polynomial time when the heuristic is admissible and consistent. Note that the Manhattan distance heuristic for this problem is both admissible and consistent.
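The reweighting can be checked on a single grid step; a minimal sketch (the helper name `reweighted` is mine):

```python
def reweighted(w, p_i, p_j):
    """Modified edge weight w(i, j) - P(i) + P(j)."""
    return w - p_i + p_j

# On the grid every edge has w = 1, and the Manhattan potential P
# changes by exactly 1 between adjacent cells, so only two modified
# weights are possible:
print(reweighted(1, 3, 2))  # step toward the goal: weight 0
print(reweighted(1, 2, 3))  # step away from the goal: weight 2
```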
The first key observation is that A* in the original graph is exactly the same as Dijkstra in the modified graph, since the "value" of node i in the modified graph is exactly its distance from the origin node plus P(i). The second key observation is that the weight of every edge in the transformed graph is either 0 or 2. Thus, we can simulate A* using a deque (double-ended queue) instead of an ordinary priority queue: whenever we relax an edge of weight 0, push the node to the front of the deque, and whenever we relax an edge of weight 2, push it to the back.
Thus, this algorithm simulates A* and works in linear time in the worst case.
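A hedged sketch of this deque simulation in Python, for a single source and target (the `passable` obstacle predicate and function name are assumptions; this is the 0-1 BFS technique adapted to edge weights 0 and 2):

```python
from collections import deque

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def deque_astar(source, target, passable):
    """A* simulated with a deque instead of a heap.  With the Manhattan
    potential, every edge of the reweighted graph costs 0 (a step that
    decreases the heuristic) or 2 (a step that increases it), so a
    0-2 variant of 0-1 BFS suffices: weight-0 relaxations go to the
    front of the deque, weight-2 relaxations to the back.  Returns the
    shortest path length in the original graph, or None."""
    dq = deque([(source, 0)])
    dist = {source: 0}
    while dq:
        cell, d = dq.popleft()
        if d > dist.get(cell, float("inf")):
            continue  # stale entry, a shorter route was found already
        if cell == target:
            return d  # popped in order of d + h, so d is optimal
        x, y = cell
        for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not passable(n):
                continue
            nd = d + 1
            if nd < dist.get(n, float("inf")):
                dist[n] = nd
                if manhattan(n, target) < manhattan(cell, target):
                    dq.appendleft((n, nd))   # modified weight 0
                else:
                    dq.append((n, nd))       # modified weight 2
    return None
```

On an obstacle-free 3x3 grid, the distance from `(2, 2)` to `(0, 0)` comes out as 4, and each cell is pushed and popped O(1) times, giving the claimed linear worst case.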