Search code examples
c++algorithmperformancegraph-theorydepth-first-search

How to calculate the total distance between various vertices in a graph?


Let's say I have a weighted, undirected, acyclic graph with no negative value weights, comprised of n vertices and n-1 edges. If I want to calculate the total distance between every single one of them (using edge weight) and then add it up, which algorithm should I use? If for example a graph has 4 vertices, connected like a-b, a-c, c-d then the program should output the total distance needed to go from a-d, a-c, a-b, b-c, b-d and so on. You could call it every possible path between the given vertices. The language I am using is C++.

I have tried using Dijikstra's and Prim's algorithm, but none have worked for me. I have thought about using normal or multisource DFS, but I have been struggling with it for some time now. Is there really a fast way to calculate it, or have I misunderstood the problem entirely?


Solution

  • Since you have an acyclic graph, there is only one possible path between any two points. This makes things a lot simpler to compute and you don't need to use any real pathfinding algorithms.

    Let's say we have an edge E that connects nodes A and B. Calculate how many nodes can be reached from node A, not using edge E (including A). Multiply that by the number of nodes that can be reached from node B, not using edge E (including B). Now you have the number of paths that travel through edge E. Multiply this by the weight of edge E, and you have the total contribution of edge E to the sum.

    Do the same thing for every edge and add up the results.

    To make the algorithm more efficient, each edge can store cached values that say the number of nodes that are reachable on each side of the edge.

    You don't have to use a depth first search. Here is some pseudocode showing how you calculate the number of nodes reachable on a side of edge E very fast taking advantage of caching:

    int count_nodes_reachable_on_edge_side(Edge e, Node a) {
      // assume edge e directly connects to node a
      if (answer is already cached in e) { return the answer; }
      
      answer = 1;  // a is reachable
      for each edge f connected to node a {
        if (f is not e) {
          let b be other node f touches (not a)
          answer += count_nodes_reachable_on_edge_side(f, b)
        }
      }
    
      cache the answer in edge e;
      return answer;
    }