Search code examples
algorithmgraphpseudocodea-starconsistency

Is this A* pseudocode taking the condition of admisible ad consistent heuristic function hypothesis for granted?


This is the pseudocode of A* from wiki (https://en.wikipedia.org/wiki/A*_search_algorithm):

function reconstruct_path(cameFrom, current)
    total_path := {current}
    while current in cameFrom.Keys:
        current := cameFrom[current]
        total_path.prepend(current)
    return total_path

// A* finds a path from start to goal.
// h is the heuristic function. h(n) estimates the cost to reach goal from node n.
function A_Star(start, goal, h)
    // The set of discovered nodes that may need to be (re-)expanded.
    // Initially, only the start node is known.
    // This is usually implemented as a min-heap or priority queue rather than a hash-set.
    openSet := {start}

    // For node n, cameFrom[n] is the node immediately preceding it on the cheapest path from start
    // to n currently known.
    cameFrom := an empty map

    // For node n, gScore[n] is the cost of the cheapest path from start to n currently known.
    gScore := map with default value of Infinity
    gScore[start] := 0

    // For node n, fScore[n] := gScore[n] + h(n). fScore[n] represents our current best guess as to
    // how cheap a path could be from start to finish if it goes through n.
    fScore := map with default value of Infinity
    fScore[start] := h(start)

    while openSet is not empty
        // This operation can occur in O(Log(N)) time if openSet is a min-heap or a priority queue
        current := the node in openSet having the lowest fScore[] value
        if current = goal
            return reconstruct_path(cameFrom, current)

        openSet.Remove(current)
        for each neighbor of current
            // d(current,neighbor) is the weight of the edge from current to neighbor
            // tentative_gScore is the distance from start to the neighbor through current
            tentative_gScore := gScore[current] + d(current, neighbor)
            if tentative_gScore < gScore[neighbor]
                // This path to neighbor is better than any previous one. Record it!
                cameFrom[neighbor] := current
                gScore[neighbor] := tentative_gScore
                fScore[neighbor] := tentative_gScore + h(neighbor)
                if neighbor not in openSet
                    openSet.add(neighbor)

    // Open set is empty but goal was never reached
    return failure

I was wondering if this algorithm is assuming that the heuristic function is admissible(never overestimates the actual cost to get to the goal) and consistent (h(x) ≤ d(x, y) + h(y))?

Because i found another pseudocode of A* that is more complex:

 function A*(start,goal)
 closedset := the empty set                 % The set of nodes already evaluated.     
 openset := set containing the initial node % The set of tentative nodes to be evaluated.
 g_score[start] := 0                        % Distance from start along optimal path.
 came_from := the empty map                 % The map of navigated nodes.
 h_score[start] := heuristic_estimate_of_distance(start, goal)
 f_score[start] := h_score[start]           % Estimated total distance from start to goal through y.
 while openset is not empty
     x := the node in openset having the lowest f_score[] value
     if x = goal
         return reconstruct_path(came_from,goal)
     remove x from openset
     add x to closedset
     foreach y in neighbor_nodes(x)
         if y in closedset
             continue
         tentative_g_score := g_score[x] + dist_between(x,y)
         
         if y not in openset
             add y to openset
            
             tentative_is_better := true
         elseif tentative_g_score < g_score[y]
             tentative_is_better := true
         else
             tentative_is_better := false
         if tentative_is_better = true
             came_from[y] := x
             g_score[y] := tentative_g_score
             h_score[y] := heuristic_estimate_of_distance(y, goal)
             f_score[y] := g_score[y] + h_score[y]
 return failure

 function reconstruct_path(came_from,current_node)
     if came_from[current_node] is set
         p = reconstruct_path(came_from,came_from[current_node])
         return (p + current_node)
     else
         return the empty path

Both algorithm seems working correctly with euclidean heuristic function on an undirected graph that has euclidean distance between nodes as weight. But is the second pseudocode more general? Does the first one takes for granted the admissibility and the consistency of the heuristic function?


Solution

  • Neither algorithm can ensure to find the shortest path if the heuristic function is not admissible. But you are right that the first algorithm uses the assumption of consistency, while the second one does not.

    This difference is expressed in the use of a closed set: the first algorithm does not maintain a closed set. The closed set of the second algorithm collects nodes for which the shortest path from the source to that node has been determined. The role of this closed set is to avoid that the algorithm would consider such a node as a next optimal target via a different path. This should never have success, as we already determined the shortest path to that node and it wasn't the current path.

    However, if edge weights could be negative, then there might be a cycle that keeps making the cost of a path less and less, just by running through that cycle repeatedly. The first algorithm would get stuck in such an endless loop, while the second wouldn't.

    Some other differences between the algorithms are not essential:

    • Only the first version uses a map that has infinity as default value: the second version does not have this default value, and so it has to check whether the neighbor is in the open set. If not, then this is the first visit to that neighbor and it should be made. The first version does not have to make this separate check, as in this case the best distance will be found to be infinity and so surely this first visit improves on that.

    • The second version stores the results of the heuristic function in an array: but it brings no benefit, because that value is only read from that array right after it is stored, so storing it is not necessary, and the first version demonstrates that.