I have created an undirected graph using networkx, I need to find a list of all the connected components.
connected_components = {}
def dfs(node):
global connected_components, G
if node not in connected_components:
connected_components[node] = set()
for next in G.adj[node]:
dfs(next)
connected_components[node] = connected_components[next]
connected_components[node].add(node)
for node_ in G:
dfs(node_)
connected_comp_as_tuples = map(tuple, connected_components.values())
unique_components = set(connected_comp_as_tuples)
CC=list(unique_components)
I've tried using this code but the result is not the same as the one given using the nx.connected_components() command. What am I doing wrong?
connected_components[node] = connected_components[next]
is executed also when next
is actually the "parent" from which the DFS call came. This way you can lose out of the set of nodes that were already collected in a set that were really "descendants" of the current node.
For instance, if you have a graph with 4 nodes, like this:
and the edges were added in an order such that G.adj
looks like this:
[
[2],
[2],
[3, 1, 0],
[2]
]
...then the DFS calls will first go via 0, to 2, to 3, then from 2 to 1. At that moment all is OK, and connected_components[2]
has {1, 3}
. But then the problem arises: from 2 we visit 0. But node 0 is still associated with an empty set, and so node 2 also gets associated with it, losing the connection with the component that has nodes 1 and 3.
You should just implement the standard algorithm, using a visited
marker:
def find_components(G):
visited = set()
def dfs(node):
if node not in visited:
visited.add(node)
yield node
for nxt in G.adj[node]:
yield from dfs(nxt)
for node in G:
if node not in visited:
yield tuple(dfs(node)) # a single component
connected_components = list(find_components(G))
print(connected_components)