Search code examples
pythondictionarygraphlongest-path

finding longest path in a graph


I am trying to solve a program, where in I have to find the max number of cities connected for a given list of routes.

for eg: if the given route is [['1', '2'], ['2', '4'], ['1', '11'], ['4', '11']] then max cities connected will be 4 constraint is I can't visit a city which I already have visited.

I need ideas, as in how to progress.

For now, What I have thought is if I could be able to create a dictionary with cities as a key and how many other cities its connected to as its value, i get somewhere near to the solution(I hope). for eg: My dictionary will be {'1': ['2', '11'], '4': ['11'], '2': ['4']} for the above given input. I want help to proceed further and guidance if I am missing anything.


Solution

  • You can use a defaultdict to create your "Graph" from your list of edges/paths:

    edges = [['1', '2'], ['2', '4'], ['1', '11'], ['4', '11']]
    
    G = defaultdict(list)
    for (s,t) in edges:
        G[s].append(t)
        G[t].append(s)
    
    print G.items()
    

    Output:

    [
      ('1', ['2', '11']), 
      ('11', ['1', '4']), 
      ('2', ['1', '4']), 
      ('4', ['2', '11'])
    ]
    

    Note that I added the edges in both directions, since you're working with an undirected graph. So with the edge (a,b), G[a] will include b and G[b] will include a.

    From this, you can use an algorithm like depth-first search or breadth-first search to discover all the paths in the graph.

    In the following code, I used DFS:

    def DFS(G,v,seen=None,path=None):
        if seen is None: seen = []
        if path is None: path = [v]
    
        seen.append(v)
    
        paths = []
        for t in G[v]:
            if t not in seen:
                t_path = path + [t]
                paths.append(tuple(t_path))
                paths.extend(DFS(G, t, seen[:], t_path))
        return paths
    

    Which you can use with:

    G = defaultdict(list)
    for (s,t) in edges:
        G[s].append(t)
        G[t].append(s)
    
    print DFS(G, '1')
    

    Output:

    [('1', '2'), ('1', '2', '4'), ('1', '2', '4', '11'), ('1', '11'), ('1', '11', '4'), ('1', '11', '4', '2')]
    

    So the full code, with the final bit that shows the longest path:

    from collections import defaultdict
    
    def DFS(G,v,seen=None,path=None):
        if seen is None: seen = []
        if path is None: path = [v]
    
        seen.append(v)
    
        paths = []
        for t in G[v]:
            if t not in seen:
                t_path = path + [t]
                paths.append(tuple(t_path))
                paths.extend(DFS(G, t, seen[:], t_path))
        return paths
    
    
    # Define graph by edges
    edges = [['1', '2'], ['2', '4'], ['1', '11'], ['4', '11']]
    
    # Build graph dictionary
    G = defaultdict(list)
    for (s,t) in edges:
        G[s].append(t)
        G[t].append(s)
    
    # Run DFS, compute metrics
    all_paths = DFS(G, '1')
    max_len   = max(len(p) for p in all_paths)
    max_paths = [p for p in all_paths if len(p) == max_len]
    
    # Output
    print("All Paths:")
    print(all_paths)
    print("Longest Paths:")
    for p in max_paths: print("  ", p)
    print("Longest Path Length:")
    print(max_len)
    

    Output:

    All Paths:
       [('1', '2'), ('1', '2', '4'), ('1', '2', '4', '11'), ('1', '11'), ('1', '11', '4'), ('1', '11', '4', '2')]
    Longest Paths:
       ('1', '2', '4', '11')
       ('1', '11', '4', '2')
    Longest Path Length:
       4
    

    Note, the "starting point" of your search is specified by the second argument to the DFS function, in this case, it's '1'.


    Update: As discussed in the comments the above code assumes you have a starting point in mind (specifically the code uses the node labelled '1').

    A more general method, in the case that you have no such starting point, would be to perform the search starting at every node, and take the overall longest. (Note: In reality, you could be smarter than this)

    Changing the line

    all_paths = DFS(G, '1')
    

    to

    all_paths = [p for ps in [DFS(G, n) for n in set(G)] for p in ps]
    

    would give you the longest path between any two points.

    (This is a silly list comprehension, but it allows me to update only a single line. Put more clearly, it's equivalent to the following:

    all_paths = []
    for node in set(G.keys()):
        for path in DFS(G, node):
            all_paths.append(path)
    

    or

    from itertools import chain
    all_paths = list(chain.from_iterable(DFS(G, n) for n in set(G)))
    

    ).