Search code examples
pythonnetworkx

Networkx: extract the connected component containing a given node (directed graph)


I am trying to extract from a big graph the sub-graph of all connected nodes containing a specific node.

Is there a solution in the Networkx library?

[EDIT]
My graph is a DiGraph

[EDIT]
Rephrased simply:
I want the part of my graph that contain my specific node N_i and and all the nodes that are connected directly or indirectly (passing by other nodes) using any incoming or outcoming edges.
Example:

>>> g = nx.DiGraph()
>>> g.add_path(['A','B','C',])
>>> g.add_path(['X','Y','Z',])
>>> g.edges()
[('A', 'B'), ('B', 'C'), ('Y', 'Z'), ('X', 'Y')]

My desired result would be:

>>> g2 = getSubGraph(g, 'B')
>>> g2.nodes()
['A', 'B', 'C']
>>> g2.edges()
[('A', 'B'), ('B', 'C')]

Solution

  • You can use shortest_path() to find all of the nodes reachable from a given node. In your case you need to first convert the graph to an undirected representation so both in- and out-edges are followed.

    In [1]: import networkx as nx
    
    In [2]: >>> g = nx.DiGraph()
    
    In [3]: >>> g.add_path(['A','B','C',])
    
    In [4]: >>> g.add_path(['X','Y','Z',])
    
    In [5]: u = g.to_undirected()
    
    In [6]: nodes = nx.shortest_path(u,'B').keys()
    
    In [7]: nodes
    Out[7]: ['A', 'C', 'B']
    
    In [8]: s = g.subgraph(nodes)
    
    In [9]: s.edges()
    Out[9]: [('A', 'B'), ('B', 'C')]
    

    Or in one line

    In [10]: s = g.subgraph(nx.shortest_path(g.to_undirected(),'B'))
    
    In [11]: s.edges()
    Out[11]: [('A', 'B'), ('B', 'C')]