Search code examples
pythonalgorithmperformancenetworkxgraph-theory

Generate all digraphs of a given size up to isomorphism


I am trying to generate all directed graphs with a given number of nodes up to graph isomorphism so that I can feed them into another Python program. Here is a naive reference implementation using NetworkX, I would like to speed it up:

from itertools import combinations, product
import networkx as nx

def generate_digraphs(n):
  graphs_so_far = list()
  nodes = list(range(n))
  possible_edges = [(i, j) for i, j in product(nodes, nodes) if i != j]
  for edge_mask in product([True, False], repeat=len(possible_edges)):
    edges = [edge for include, edge in zip(edge_mask, possible_edges) if include]
    g = nx.DiGraph()
    g.add_nodes_from(nodes)
    g.add_edges_from(edges)
    if not any(nx.is_isomorphic(g_before, g) for g_before in graphs_so_far):
      graphs_so_far.append(g)
  return graphs_so_far

assert len(generate_digraphs(1)) == 1
assert len(generate_digraphs(2)) == 3
assert len(generate_digraphs(3)) == 16

The number of such graphs seems to grow pretty quickly and is given by this OEIS sequence. I am looking for a solution that is able to generate all graphs up to 7 nodes (about a billion graphs in total) in a reasonable amount of time.

Representing a graph as a NetworkX object is not very important; for example, representing a graph with an adjacency list or using a different library is good with me.


Solution

  • There’s a useful idea that I learned from Brendan McKay’s paper “Isomorph-free exhaustive generation” (though I believe that it predates that paper).

    The idea is that we can organize the isomorphism classes into a tree, where the singleton class with the empty graph is the root, and each class with graphs having n > 0 nodes has a parent class with graphs having n − 1 nodes. To enumerate the isomorphism classes of graphs with n > 0 nodes, enumerate the isomorphism classes of graphs with n − 1 nodes, and for each such class, extend its representatives in all possible ways to n nodes and filter out the ones that aren’t actually children.

    The Python code below implements this idea with a rudimentary but nontrivial graph isomorphism subroutine. It takes a few minutes for n = 6 and (estimating here) on the order of a few days for n = 7. For extra speed, port it to C++ and maybe find better algorithms for handling the permutation groups (maybe in TAoCP, though most of the graphs have no symmetry, so it’s not clear how big the benefit would be).

    import cProfile
    import collections
    import itertools
    import random
    
    
    # Returns labels approximating the orbits of graph. Two nodes in the same orbit
    # have the same label, but two nodes in different orbits don't necessarily have
    # different labels.
    def invariant_labels(graph, n):
        labels = [1] * n
        for r in range(2):
            incoming = [0] * n
            outgoing = [0] * n
            for i, j in graph:
                incoming[j] += labels[i]
                outgoing[i] += labels[j]
            for i in range(n):
                labels[i] = hash((incoming[i], outgoing[i]))
        return labels
    
    
    # Returns the inverse of perm.
    def inverse_permutation(perm):
        n = len(perm)
        inverse = [None] * n
        for i in range(n):
            inverse[perm[i]] = i
        return inverse
    
    
    # Returns the permutation that sorts by label.
    def label_sorting_permutation(labels):
        n = len(labels)
        return inverse_permutation(sorted(range(n), key=lambda i: labels[i]))
    
    
    # Returns the graph where node i becomes perm[i] .
    def permuted_graph(perm, graph):
        perm_graph = [(perm[i], perm[j]) for (i, j) in graph]
        perm_graph.sort()
        return perm_graph
    
    
    # Yields each permutation generated by swaps of two consecutive nodes with the
    # same label.
    def label_stabilizer(labels):
        n = len(labels)
        factors = (
            itertools.permutations(block)
            for (_, block) in itertools.groupby(range(n), key=lambda i: labels[i])
        )
        for subperms in itertools.product(*factors):
            yield [i for subperm in subperms for i in subperm]
    
    
    # Returns the canonical labeled graph isomorphic to graph.
    def canonical_graph(graph, n):
        labels = invariant_labels(graph, n)
        sorting_perm = label_sorting_permutation(labels)
        graph = permuted_graph(sorting_perm, graph)
        labels.sort()
        return max(
            (permuted_graph(perm, graph), perm[sorting_perm[n - 1]])
            for perm in label_stabilizer(labels)
        )
    
    
    # Returns the list of permutations that stabilize graph.
    def graph_stabilizer(graph, n):
        return [
            perm
            for perm in label_stabilizer(invariant_labels(graph, n))
            if permuted_graph(perm, graph) == graph
        ]
    
    
    # Yields the subsets of range(n) .
    def power_set(n):
        for r in range(n + 1):
            for s in itertools.combinations(range(n), r):
                yield list(s)
    
    
    # Returns the set where i becomes perm[i] .
    def permuted_set(perm, s):
        perm_s = [perm[i] for i in s]
        perm_s.sort()
        return perm_s
    
    
    # If s is canonical, returns the list of permutations in group that stabilize s.
    # Otherwise, returns None.
    def set_stabilizer(s, group):
        stabilizer = []
        for perm in group:
            perm_s = permuted_set(perm, s)
            if perm_s < s:
                return None
            if perm_s == s:
                stabilizer.append(perm)
        return stabilizer
    
    
    # Yields one representative of each isomorphism class.
    def enumerate_graphs(n):
        assert 0 <= n
        if 0 == n:
            yield []
            return
        for subgraph in enumerate_graphs(n - 1):
            sub_stab = graph_stabilizer(subgraph, n - 1)
            for incoming in power_set(n - 1):
                in_stab = set_stabilizer(incoming, sub_stab)
                if not in_stab:
                    continue
                for outgoing in power_set(n - 1):
                    out_stab = set_stabilizer(outgoing, in_stab)
                    if not out_stab:
                        continue
                    graph, i_star = canonical_graph(
                        subgraph
                        + [(i, n - 1) for i in incoming]
                        + [(n - 1, j) for j in outgoing],
                        n,
                    )
                    if i_star == n - 1:
                        yield graph
    
    
    def test():
        print(sum(1 for graph in enumerate_graphs(5)))
    
    
    cProfile.run("test()")