Finding all the roots inside a DiGraph (networkx)

Assuming I have this code which creates a DiGraph:

dict = {1:['a1', 'a2', 'a3'], 2:['a4', 'a5','a7']}
graph = nx.from_dict_of_lists(dict)
digraph = nx.DiGraph(graph)

How can I find all the roots in this graph? expected output for this graph is [1,2]

If it's is somewhat more convienent to you, I have written the code inside a google colab notebook where you can see the graph, hope it helps.

EDIT: this is somehow related to this question the difference is that in that post there was an assumption of that the graph is connected so there is only one root; it's not the case in my example. Can I 'divide' my graph to connected sub-graphs and then search for a root in each one?

Solution

I think you're not quite generating the graph you have in mind with this example, as all existent connections have bidirectional edges. Probably you meant to generate the graph:

d = {1:['a1', 'a2', 'a3'], 2:['a4', 'a5','a7']}
G = nx.from_dict_of_lists(d, create_using=nx.DiGraph)
print(graph.edges())
# OutEdgeView([('1', 'a1'), ('1', 'a2'), ('1', 'a3'), ('2', 'a4'), ('2', 'a5'), ('2', 'a7')])

Which gives the two component graph:

plt.subplots(figsize=(12,6))
nx.draw(G, with_labels=True, node_color='lightblue', node_size=500)

In order to find the root nodes in each component, we first need to generate the induced subgraphs from the existent connected components, for that we have Graph.subgraph. Then, on each subgraph, the root node can be found by searching over all nodes, and keeping the one with an in_degree of 0.

roots = []
for component in nx.weakly_connected_components(G):
    G_sub = G.subgraph(component)
    roots.extend([n for n,d in G_sub.in_degree() if d==0])

Which gives:

print(roots)
# [1, 2]