Search code examples
pythondepth-first-search

DFS to find all possible path is very slow


I wrote DFS-like algorithm to find all possible paths starting from zero level. With 2,000 nodes and 5,000 edges, below code execution is extremely slow. Any suggestion for this algorithm?

    all_path = []

    def printAllPathsUntil(s, path):
        path.append(s)
        if s not in adj or len(adj[s]) <= 0:
            all_path.append(path[:]) # EDIT2
        else:
            for i in adj[s]:
                printAllPathsUntil(i, path)
        path.pop()

    for point in points_in_start:
        path = []
        printAllPathsUntil(point, path)

And the adj holds edges; start position as key and destination list as value.

    points_in_start = [0, 3, 7]
    adj = {0: [1, 8],
           1: [2, 5],
           2: [],
           3: [2, 4],
           4: [],
           5: [6],
           6: [],
           7: [6],
           8: [2]
           }

EDIT1

  • This is a DAG. No cycles.

enter image description here


Solution

  • The problem with your algorithm is that it will do a lot of repeated work. This is not the case in your example, as the only time when a node is reached by two other nodes, it is a leaf node, like C, but imaging an edge from D to B: That would mean that the entire sub-graph starting at B is visited again! For a graph with 2000 nodes, this will result in a significant slow-down.

    To counter this, you can use memoization, but this means that you have to restructure your algorithm to instead of adding to the existing path and then adding that path to all_paths, it has to return the (partial) paths starting at the current node and combine those to the full paths with the parent node. You can then use functools.lru_cache to re-use all those partial results when you visit B again coming from another node.

    from functools import lru_cache
    
    @lru_cache(None)
    def getAllPathsUntil(s):
        if s not in adj or not adj[s]:
            return [ [s] ]
        else:
            return [ [s, *p] for a in adj[s]
                             for p in getAllPathsUntil(a)]
    
    all_paths = []
    for point in points_in_start:
        all_paths.extend(getAllPathsUntil(point))