So I was recently studying this course on Probabilistic Graphical Models and I wanted to try a hands-on example. In the example, I want to loop over all possible combinations (29,281) acyclic digraphs (or DAGs) and see how they fit the data.
I know that the number of all possible graphs is given by
from scipy.misc import comb
import numpy as np
def a(n):
if n == 0:
return 1
else:
sum = 0
for k in range(1,n+1):
sum += np.power(-1,k+1) * \
comb(n,k) * \
np.power(2,k*(n-k)) * \
a(n-k)
return sum
This gives us the series A003024
But I'd like to find the algorithm to be able to loop over all these possible graphs and not just count them.
I know there is some code available for undirected graphs, but I couldn't get them to work for me.
I'm open to any form of representation of the graph, be it a library, a custom function or a list of lists.
Example- when you have two labels, there are 3 possible graphs:
[[A:{}], [B:{}]] # A B no connection P(A,B) = P(A)P(B)
[[A:{B}], [B:{}]] # A -> B P(A,B) = P(A)P(B|A)
[[A:{}], [B:{A}]] # A <- B P(A,B) = P(B)P(A|B)
Since you want 29,281 resulting graphs, labelling must be important for you (IOW, you're not modding out by isomorphism.) Using a brute-force approach in networkx:
from itertools import combinations, product
import networkx as nx
def gen_dag(num_nodes):
all_edges = list(combinations(range(num_nodes), 2))
for p in product([None, 1, -1], repeat=len(all_edges)):
G = nx.DiGraph()
G.add_nodes_from(range(num_nodes))
edges = [edge[::edge_dir] for edge, edge_dir in zip(all_edges, p) if edge_dir]
G.add_edges_from(edges)
if nx.is_directed_acyclic_graph(G):
yield G
which gives
In [75]: graphs = list(gen_dag(5))
In [76]: len(graphs)
Out[76]: 29281
and (for example)
In [79]: graphs[1234].edges()
Out[79]: OutEdgeView([(3, 1), (3, 4), (4, 0), (4, 2)])
In [80]: nx.adjacency_matrix(graphs[1234]).todense()
Out[80]:
matrix([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 1, 0, 0, 1],
[1, 0, 1, 0, 0]], dtype=int64)