This is a code that I got from the site here and I'd like to know whether this implementation of A* is correct. I have looked at it and compare it with the wikipedia page and it seems valid.. The reason why I ask is because in the site it says there is still a bug in this code, I tried finding it but can't find any. I want to change it though so that it takes an origin and destination as input parameter
<?php
class AStarSolver
{
function solve(&$s)
{
include_once('PQueue.class.php');
$o = new PQueue();
$l = array();
$c = array();
$p = array();
$a = $s->getStartIndex();
$z = $s->getGoalIndex();
$d = $s->goalDistance($a);
$n0 = array('g'=>0, 'h'=>$d, 'i'=>$a, 'p'=>NULL, 'f'=>$d);
$o->push($n0, -$d);
$l[$a] = TRUE;
while (! $o->isEmpty())
{
$n = $o->pop();
if ($n['i'] == $z)
{
while ($n)
{
$p[] = $n['i'];
$n = $n['p'];
}
break;
}
foreach ($s->getNeighbors($n['i']) as $j => $w)
{
if ((isset($l[$j]) || isset($c[$j])) && isset($m) && $m['g'] <= $n['g']+$w)
continue;
$d = $s->goalDistance($j);
$m = array('g'=>$n['g']+$w, 'h'=>$d, 'i'=>$j, 'p'=>$n, 'f'=>$n['g']+$w+$d);
if (isset($c[$j]))
unset($c[$j]);
if (! isset($l[$j]))
{
$o->push($m, -$m['f']);
$l[$j] = TRUE;
}
}
$c[$n['i']] = $n;
}
return $p;
}
}
?>
The code to the Pqueue can be found here
The site suggests that the bug might be in the PQueue
class.
In PQueue::pop
this
$j+1 < $m
is a test whether the heap node at $i
has one child (at $j
) or two (at $j
and $j+1
).
But $m
here is count($h)
only on the first iteration through the loop, since the --$m
in the loop condition is evaluated every time.
Move that --$m
next to the array_pop
where it belongs, and that will be one less bug.
Now for AStarSolver
.
The variables are (relative to Wikipedia pseudocode):
$o
– open set as priority queue;$l
– open set as map keyed by index;$c
– closed set as map keyed by index;$n
– current node (x);$m
– neighbour node (y) ?;$j
– neighbour node index.Problems that I see:
$n = $o->pop()
isn't followed by unset($l[$n['i']])
. Since both $o
and $l
represent the same set, they should be kept in sync.
According to Wikipedia the closed set is only used if the heuristic is monotone (and I think a distance heuristic is monotone), and in that case, once a node is added to the closed set, it is never visited again. This code seems to implement some other pseudocode, which does remove nodes from the closed set. I think this defeats the purpose of the closed set, and the first condition in the inner loop should be
if (isset($c[$j]) || isset($l[$j]) && isset($m) && $m['g'] <= $n['g']+$w)
Then we can remove the unset($c[$j])
.
$m['g']
in this condition should be the g-score of the current neighbour indexed by $j
. But $m
has whatever value is left over from the previous loop: the node corresponding to $j
on a previous iteration.
What we need is a way to find a node and its g-score by node index. We can store the node in the $l
array: instead of $l[$j] = TRUE
we do $l[$j] = $m
and the above condition becomes
if (isset($c[$j]) || isset($l[$j]) && $l[$j]['g'] <= $n['g']+$w)
Now the tricky bit. If the node we just found is not in the open set, we add it there (that's the $o->push
and $l[$j] =
).
However, if it is in the open set we just found a better path to it, so we must update it. The code doesn't do this and it's tricky because the priority queue doesn't provide a routine for increasing the priority of an element. However, we can rebuild the priority queue completely and the last bit of code in the inner loop becomes
if (! isset($l[$j])) {
$o->push($m, -$m['f']);
$l[$j] = $m; // add a new element
} else {
$l[$j] = $m; // replace existing element
$o = new PQueue();
foreach ($l as $m)
$o->push($m, -$m['f']);
}
This is not terribly efficient, but it's a starting point. Changing an element in a priority queue isn't efficient anyway, because you first have to find it.
Even without these changes the algorithm does find a path, just not the best path. You can see it in the mentioned mazes:
In the crazy
maze in the third inner circuit: the taken upper path around is slightly longer than the lower path would have been because of the obstacles on the left.
In the big
maze in the upper-right part of the path there's an unnecessary loop up.
Since this was on my mind, I implemented my own version of the algorithm and posted it in an answer to your previous question.