Search code examples
pythonnodesnetworkx

How to find a node with a shortest path of length equal to some number in networkx python?


I have a simple code to create a graph,G in networkx.

import networkx as nx
import matplotlib.pyplot as plt
%matplotlib notebook

G = nx.DiGraph()

G.add_edge(1,2); G.add_edge(1,4)
G.add_edge(3,1); G.add_edge(3,2)
G.add_edge(3,4); G.add_edge(2,3)
G.add_edge(4,3)

I want to find "which node in G is connected to the other nodes by a shortest path of length equal to the diameter of G ".

there are two of these combinations, [1,3] and [2,4], which can be found by nx.shortest_path(G, 1) and nx.shortest_path(G, 2),respectively.

Or, for example, if I use nx.shortest_path_length(G, source=2) then I get {2: 0, 3: 1, 1: 2, 4: 2}. so the length=2 is from node 2 to node 4, which is ok.

now, I'm trying to generalise it for all of the nodes to see if I can find the target nodes.

for node in G.nodes():
    target = [k for k,v in nx.shortest_path_length(G, node).items() if v == nx.diameter(G)]
    print(target)

and I get this odd result:

[3]
[1, 4]
[1, 2]
[]

Can anybody explain what this result means? as I'm trying to apply this method to solve a bigger problem.


Solution

  • For the graph you provided:

    G = nx.DiGraph()
    
    G.add_edge(1,2); G.add_edge(1,4)
    G.add_edge(3,1); G.add_edge(3,2)
    G.add_edge(3,4); G.add_edge(2,3)
    G.add_edge(4,3)
    

    The following:

    for node in G.nodes():
        target = [k for k,v in nx.shortest_path_length(G, node).items() if v == nx.diameter(G)]
        print(target)
    

    will print the target to which node is of distance equal to nx.diameter(G)

    I would advise not calculating the diameter inside the for loop since that can turn out quite expensive.

    In comparison, for a 200 node graph (nx.barabasi_albert_graph(200, 2, seed=1)) with the diameter calculation outside the for loop it takes ~74ms. The other option (with the diameter calculation inside the for loop) takes... well it's still running :´) but i'd say it will take waaay too long.

    Also, instead of just the targets print the start and end nodes for readability:

    diameter = nx.diameter(G)
    for node in G.nodes():
        start_end_nodes = [(node, k) for k,v in nx.shortest_path_length(G, node).items() if v == diameter]
        print(start_end_nodes)
    

    yielding:

    [(1, 3)]          # the path from 1 to 3 has lenght 2 = nx.diameter(G)
    [(2, 1), (2, 4)]  # the paths from 2 to 1 and 2 to 4 have lenght 2
    [(4, 1), (4, 2)]  # the paths from 4 to 1 and 4 to 2 have lenght 2
    []                # there is no path starting at 3 that has lenght 2