Search code examples
pythongraphnetworkx

How to calculate overall distances from lowest root(s) of a directed graph with networkx


If you look at this DAG (directed acyclic graph):

I want to create a dict which maps the distance from the lowest node(s) to all others nodes which is similar to the x position (height) of each node from the bottom in the rendered graph. For that given graph it would be:

distance_nodes_map: {
  0: {'base-zero', 'base-one'}, 
  1: {'low-b', 'low-a', 'low-c'}, 
  3: {'high-x', 'high-z', 'high-y'}, 
  2: {'mid-r', 'mid-q', 'mid-p'}, 
  4: {'super'}
}

I wrote an algorithm which worked for that graph above but then I've tested another graph and it didn't work anymore. I tried some algorithms and functions like shortest path or descendants_at_distance but I don't think they are really helpful as an input to calculate the distances.

My algorithm doesn't work for instance for this graph:

https://gist.github.com/timaschew/3b08a07243fa6f43773014ef5e705c96

Here is gist which contains:

  • a python script which reads a YAML file, the dependency/graph structure and generates a HTML with a rendered mermaid graph (I've removed my algorithm to calculate the distances in a wrong way)
  • both graphs which are shown here, as a YAML file

Solution

  • You are looking for an algorithm that draws a layered graph. There are many different algorithms, and you should choose the one that best fit your needs (for example, have a look at the following paper A Technique for Drawing Directed Graphs by Gansner et al.).

    Many of those algorithms are already implemented in Graphviz (a very famous and powerful graph visualization software). Once you have installed it, it's pretty straightforward to compute the result you are looking for (G is your directed acyclic graph built using networkx.DiGraph):

    from networkx.drawing.nx_agraph import graphviz_layout
    
    def get_distance_nodes_map(G):
        pos = graphviz_layout(G, prog='dot')
        coor = sorted({y for k, (x, y) in pos.items()})
        kmap = dict(zip(coor, range(len(coor))))
        distance_nodes_map = {level: set() for level in kmap.values()}
        for k, (x, y) in pos.items():
            distance_nodes_map[kmap[y]].add(k)
        return distance_nodes_map
    

    Here are a couple of examples using data that you provided:

    >>> from networkx import DiGraph
    >>> from pprint import PrettyPrinter
    >>> pp = PrettyPrinter()
    >>> G1 = DiGraph()
    >>> G1.add_edges_from([('super', 'high-x'), ('high-x', 'mid-p'),
    ...                    ('mid-p', 'low-b'), ('mid-p', 'low-c'),
    ...                    ('low-c', 'base-zero'), ('low-c', 'base-one'),
    ...                    ('high-y', 'mid-p'), ('high-y', 'base-zero'),
    ...                    ('high-z', 'base-one'), ('high-z', 'mid-r'),
    ...                    ('high-z', 'mid-q'), ('mid-q', 'low-a'),
    ...                    ('low-a', 'base-one')])
    >>> pp.pprint(get_distance_nodes_map(G1))
    {0: {'base-one', 'base-zero'},
     1: {'low-a', 'low-b', 'low-c'},
     2: {'mid-p', 'mid-r', 'mid-q'},
     3: {'high-y', 'high-x', 'high-z'},
     4: {'super'}}
    >>> G2 = DiGraph()
    >>> G2.add_edges_from([('n10', 'n11'), ('n11', 'n12'), ('n12', 'n13'),
    ...                    ('n13', 'n14'), ('n20', 'n14'), ('n20', 'n21'),
    ...                    ('n21', 'n22'), ('n22', 'n23'), ('n30', 'n23'),
    ...                    ('n30', 'n31'), ('n31', 'n32')])
    >>> pp.pprint(get_distance_nodes_map(G2))
    {0: {'n32'},
     1: {'n31', 'n23'},
     2: {'n30', 'n22'},
     3: {'n21', 'n14'},
     4: {'n13', 'n20'},
     5: {'n12'},
     6: {'n11'},
     7: {'n10'}}