Search code examples
pythonrecursionpathlongest-path

Recursive function to find all paths to a specific target


I need to find the longest path to a given target. The data is a dictionary of ids with the value being a list of all ids that point to that id. Also worth noting that each Id can only point to one other id.

I tried to write a recursive function that will go through each possible path, and store each unique path option to another list, from which I will find the longest path.

def create(main, vec, id):
    if (id not in INFO):
        return(main, vec, id)
    else:
        for source in INFO[id]:
            vec.append(source)
            main.append(vec)
            main, vec, id = create(main, vec, source)
        return main,vec,id

and longest function

def longest(main):
    longest = 0
    long_list = 0
    for list in main:
        if len(list) > longest:
            long_list = list
            longest = len(list)
    return long_list

when doing

INFO = {
    'D4': ['B2','B6'],
    'B6': ['D3'],
    'D3': ['F1','A2'],
    'A2': ['G8'],
    'A1': ['C3'],
    'B2': ['E3','A1']}
main, vec, id = create([],[],'D4')
print(longest(main))

I get main to have paths that stack on top of eachother. How would I fix the code so the paths don't stack. I hope to get main to look something like

[['B2'],
['B2','E3'],
['B2','A1'],
['B2','A1','C3'],
['B6'],
['B6','D3'],
['B6','D3','F1'],
['B6','D3','A2'],
['B6','D3','A2','G8']]

EDIT:

Changed line main, vec, id = create(main,[],'D4') to main, vec, id = create([],[],'D4') to clarify that main is a list of lists.


Solution

  • Due to the recursive property of your approach it is hard to keep track of the path between the current node and the starting node (root).

    However, when you are only interested in the longest path, it certainly links the root and a leave (these are the nodes which do not have a link to any other node).

    In the following code, when reaching one of the leaves, i.e. when currentNode not in INFO is true, I travel the way up and recording the path till reaching the root.

    def create(pathListRef, currentNode, rootNode):
        # pathListRef is the reference to the list containing the paths between all leaves and the root.
        # currentNode holds the current node, e.g. "D3"
        # rootNode holds reference to the root node, i.e. "D4"
    
        if (currentNode not in INFO):
            # We have reached on of the leaves
    
            reverseNode = currentNode
            # reverseNode is used to keep track of at which node we are when traveling back to the root.
    
            path = []
            # holds the relative path between the leave and reverseNode
    
            while reverseNode is not rootNode:
                # As long as we have not reached the root
    
                path.insert(0, reverseNode)
                # Prepend (like append, but at the start of the list) the reverseNode to the relative path
    
                reverseNode = list(INFO.keys())[[i for i,x in enumerate(list(INFO.values())) if reverseNode in x][0]]
                # Get the node linked with the reverseNode and move forward to that one. In essence get the key, where the value contains (since it is a list) the reverseNode.
    
            pathListRef.append(path)
            # We are at the root, so add the relative path to the final paths list
            return
    
        # This is only executed when we are not at the leave yet (since we return when we are at a leave)
        for source in INFO[currentNode]:
            create(pathListRef, source, rootNode)
    

    It is used as follows:

    myList = []
    startNode = "D4"
    create(myList, startNode, startNode)
    print(myList)  # -> [['B2', 'E3'], ['B2', 'A1', 'C3'], ['B6', 'D3', 'F1'], ['B6', 'D3', 'A2', 'G8']]
    

    And by using your longest() function you get:

    print(longest(myList))  # -> ['B6', 'D3', 'A2', 'G8']
    

    Btw, it is possible to shorten your longest() function to the following one. Furthermore, this code also returns multiple paths if there is not only one longest.

    def longest(main):
        return [x for x in main if len(x) == max([len(x) for x in main])]