Search code examples
pythongraphnetworkx

Networkx - retrieve fully connected components from graph


Is it possible to retrieve fully connected components from a graph where 'every' node within a component are connected to 'all' of the other component nodes?

I have tried the following:

import networkx as nx

edges = [
    ('a', 'b'),
    ('a', 'c'),
    ('b', 'c'),
    ('a', 'd'),
    ('f', 'g')
    ]

graph = nx.Graph(edges)
result = list(nx.connected_components(graph))
print(result)

The results are as follows: [{'b', 'c', 'a', 'd'}, {'f', 'g'}]

The issue I have is that the first component includes the node label d, even though it does not have any edges with every other node in the first component.

The results that I would like are: [{'b', 'c', 'a'}, {'f', 'g'}]


Solution

  • It looks like you might want to find the cliques, which is achieved with find_cliques:

    out = list(nx.find_cliques(graph))
    

    Output: [['f', 'g'], ['a', 'd'], ['a', 'b', 'c']]

    Now, you can post-process this output to form non overlapping groups. Since removing one or more items from a clique still yields a clique, you can sort by decreasing size and remove the already seen nodes:

    seen = set()
    out = []
    for c in map(set, sorted(nx.find_cliques(graph), key=len, reverse=True)):
        c -= seen
        seen.update(c)
        if c:
            out.append(c)
    

    Output: [{'a', 'b', 'c'}, {'f', 'g'}, {'d'}]

    Note that this doesn't always yield the possibility with cliques of a maximum size.

    For instance with:

    enter image description here

    This yields:

    # before filtering
    [['g', 'f'], ['c', 'd', 'a'], ['c', 'd', 'h'], ['c', 'b', 'a']]
    
    # after filtering
    [{'a', 'c', 'd'}, {'h'}, {'b'}, {'f', 'g'}]