
Finding cheapest path on a graph, cost determined by max-weight of used nodes


I have a graph G with a starting node S and an ending node E. What's special about this graph is that instead of the edges having costs, it's the nodes that have a cost. I want to find the path (a set of nodes, W) between S and E such that max(W) is minimized. (In fact, I am not interested in W itself, just max(W).) Equivalently: if I remove all nodes with cost larger than k, what is the smallest k such that S and E are still connected?

I have one idea, but want to know if it is correct and optimal. Here's my current pseudocode:

L := Priority Queue of nodes (minimum weight on top)
L.add(S, S.weight)

while (!L.empty) {
    X = L.poll()                    // node with the smallest bottleneck weight so far
    return X.weight if (X == E)     // reached the end node
    mark X visited
    foreach (unvisited neighbour N of X, N not in L) {
        N.weight = max(N.weight, X.weight)   // bottleneck cost of reaching N via X
        L.add(N, N.weight)
    }
}
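
For concreteness, here is the same idea as a minimal runnable Python sketch. The graph representation is an assumption for illustration: an adjacency dict adj mapping each node to its neighbours, and a dict weight of node costs. Unlike the pseudocode, it keeps the bottleneck values in a separate map instead of mutating the nodes:

import heapq

def minimax_weight(adj, weight, S, E):
    # adj: node -> iterable of neighbours; weight: node -> cost (assumed shapes).
    # Returns the smallest possible maximum node weight over all S-E paths.
    # Nodes are assumed orderable, so heap ties can be broken by comparing them.
    best = {S: weight[S]}            # best known bottleneck value per node
    heap = [(weight[S], S)]          # (bottleneck so far, node), minimum on top
    visited = set()
    while heap:
        w, x = heapq.heappop(heap)
        if x == E:
            return w                 # the first time E is popped, w is optimal
        if x in visited:
            continue                 # stale queue entry, skip it
        visited.add(x)
        for n in adj[x]:
            cand = max(w, weight[n]) # bottleneck of reaching n via x
            if n not in visited and cand < best.get(n, float('inf')):
                best[n] = cand
                heapq.heappush(heap, (cand, n))
    return None                      # S and E are not connected

With a binary heap this runs in O((n + m) log n) time, where m is the number of edges, which matches the O(n log n) estimate below on sparse graphs.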

I believe the worst case is O(n log n), where n is the number of nodes.

Here are some details for my specific problem (percolation), but I am also interested in algorithms for this problem in general. Node weights are uniformly randomly distributed between 0 and a given maximum value. My nodes are Poisson distributed on the R²-plane, and an edge between two nodes exists if the distance between them is less than a given constant. There are potentially very many nodes, so they are generated on the fly (hidden in the foreach of the pseudocode). My starting node is at (0,0), and the ending node is any node at a distance larger than R from (0,0).

EDIT: The weights on the nodes are floating point numbers.


Solution

  • Starting from an empty graph, you can insert vertices (and their edges to existing neighbours) one at a time in increasing weight order, using a fast union/find data structure to maintain the set of connected components. This is just like Kruskal's algorithm for building minimum spanning trees, except that instead of adding edges one at a time, for each vertex v that you process you combine the components of all of v's neighbours. (A runnable sketch of this approach follows at the end of this answer.)

    You also keep track of which two components contain the start and end vertices. (Initially comp(S) = S and comp(E) = E; before each union operation, the two input components X and Y can be checked to see whether either one is comp(S) or comp(E), and the latter updated accordingly in O(1) time.) As soon as these two components become a single component (i.e. comp(S) = comp(E)), you stop. The vertex just added is the maximum-weight vertex on the path between S and E that minimises the maximum weight of any vertex.


    If the graph contains n vertices and m edges, sorting the vertices by weight takes O(n log n) time. There are at most m union operations, since every edge can be used to combine two components. With a simple disjoint-set data structure, all of these union operations can be done in O(m + n log n) time, and this becomes the overall time complexity; if path compression is also used, the union/find work drops to O(m A(n)), where A(n) is the incredibly slowly growing inverse Ackermann function, but the overall complexity is unchanged because the initial sort dominates.

    Assuming integer weights, Pham Trung's binary search approach will take O((n + m) log maxW) time, where maxW is the weight of the heaviest vertex in the graph. On sparse graphs (where m = O(n)), this becomes O(n log maxW), while mine becomes O(n log n), so his algorithm will beat mine whenever log(maxW) << log(n), i.e. whenever all weights are very small. If his algorithm is called on a graph with large weights but only a small number of distinct weights, one possible optimisation is to sort the weights in O(n log n) time and then replace each weight with its rank in the sorted order (both the binary search and this rank trick are sketched below).
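
Here is a minimal runnable sketch of the union/find approach described above, using the same illustrative adj/weight dictionaries as in the question's sketch (the representation is an assumption, not part of the original answer):

def minimax_weight_unionfind(adj, weight, S, E):
    # Kruskal-style: activate vertices in increasing weight order and
    # union each newly activated vertex with its already-active neighbours.
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving (a form of path compression)
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    active = set()
    for v in sorted(adj, key=lambda u: weight[u]):
        parent[v] = v
        active.add(v)
        for n in adj[v]:
            if n in active:
                union(v, n)
        # Stop as soon as S and E fall into the same component.
        if S in active and E in active and find(S) == find(E):
            return weight[v]          # v is the bottleneck vertex
    return None                       # S and E never become connected

Stopping at the first vertex whose insertion merges the components of S and E yields exactly the minimal k from the question; checking find(S) == find(E) after every insertion adds only near-constant amortised work, so the initial sort still dominates.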
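And here is a sketch of the binary-search alternative discussed in the last paragraph, assuming integer weights as stated there: binary search on the threshold k, where each probe is a BFS restricted to vertices of weight at most k (again, all names are illustrative):

from collections import deque

def connected_under(adj, weight, S, E, k):
    # BFS using only vertices whose weight is at most k.
    if weight[S] > k or weight[E] > k:
        return False
    seen, queue = {S}, deque([S])
    while queue:
        x = queue.popleft()
        if x == E:
            return True
        for n in adj[x]:
            if n not in seen and weight[n] <= k:
                seen.add(n)
                queue.append(n)
    return False

def minimax_weight_bsearch(adj, weight, S, E):
    lo, hi = max(weight[S], weight[E]), max(weight.values())
    if not connected_under(adj, weight, S, E, hi):
        return None                   # disconnected even with every vertex allowed
    while lo < hi:                    # invariant: the answer lies in [lo, hi]
        mid = (lo + hi) // 2
        if connected_under(adj, weight, S, E, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

The rank-replacement optimisation mentioned above could look like this (the answer is then a rank, which can be mapped back to the original weight through the sorted list):

rank = {w: r for r, w in enumerate(sorted(set(weight.values())))}
weight = {v: rank[w] for v, w in weight.items()}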