Search code examples
pythonperformancenetworkxparentdirected-graph

Most elegant way to find node's predecessors with networkX


I'm working on a graphical model project with python using NetworkX. NetworkX provides simple and good functionality using dictionaries:

import networkx as nx
G = nx.DiGraph() # a directed graph
G.add_edge('a', 'b')
print G['a'] # prints {'b': {}}
print G['b'] # prints {}

I want to use directed graphs because I am coding dependencies that have directions (in the above example I have the closed form for 'b' conditional on 'a', not the other way around).

For a given node, I want to find the predecessors of that node. For the above example, par('b') should return ['a']. NetworkX does have a successor function, which finds the children of any node. Obviously, by going through all the nodes and finding those that have 'b' as a child will work, but it will be Ω(n) in the number of nodes (which will be too expensive for my application).

I cannot imagine that something this simple would be left out of this well-made package, but can't find anything.

One efficient option is to store a directed and an undirected version of the graph; all undirected edges are essentially implemented by adding both directed edges, and so it would be possible to take the set-wise difference between the adjacent nodes and the children (which would be the predecessor).

The trouble is I'm not sure of the most pythonic way to wrap the existing networkx DiGraph and Graph class to accomplish this. Really I just want to end up with a class PGraph that behaves exactly like the networkx DiGraph class but has a predecessors(node) function in addition to the successors(node) function.

Should PGraph inherit from DiGraph and encapsulate Graph (for use in the predecessors function)? How then should I force all nodes and edges to be added to both the directed and undirected graphs that it contains? Should I just reimplement the functions for adding and removing nodes and edges in PGraph (so that they are added and removed from both the directed and undirected version)? I worry that if I miss something obscure I'll be in for a headache later, which may not imply good design.

Or (and please let this be True) is there simply an easy way to get a node's predecessors in a networkx.DiGraph and I've completely missed it?

Thanks a lot for your help.


EDIT:

I think this does the job. PGraph inherits from DiGraph and encapsulates another DiGraph (this one reversed). I've overridden the methods to add & remove nodes & edges.

import networkx as nx

class PGraph(nx.DiGraph):
    def __init__(self):
        nx.DiGraph.__init__(self)
        self.reversed_graph = nx.DiGraph()
    def add_node(self, n, attr_dict=None, **attr):
        nx.DiGraph.add_node(self, n, attr_dict, **attr)
        self.reversed_graph.add_node(n, attr_dict, **attr)
    def add_nodes_from(self, ns, attr_dict=None, **attr):
        nx.DiGraph.add_nodes_from(self, ns, attr_dict, **attr)
        self.reversed_graph.add_nodes_from(ns, attr_dict, **attr)
    def add_edge(self, a, b, attr_dict=None, **attr):
        nx.DiGraph.add_edge(self, a, b, attr_dict, **attr)
        self.reversed_graph.add_edge(b, a, attr_dict, **attr)
    def add_edges_from(self, es, attr_dict=None, **attr):
        nx.DiGraph.add_edges_from(self, es, attr_dict, **attr)
        self.reversed_graph.add_edges_from(es, attr_dict, **attr)
    def remove_node(self, n):
        nx.DiGraph.remove_node(self, n)
        self.reversed_graph.remove_node(n)
    def remove_nodes_from(self, ns):
        nx.DiGraph.remove_nodes_from(self, ns)
        self.reversed_graph.remove_nodes_from(ns)
    def remove_edge(self, a, b):
        nx.DiGraph.remove_edge(self, b, a)
        self.reversed_graph.remove_edge(a, b)
    def remove_edges_from(self, es):
        nx.DiGraph.remove_edges_from(self, es)
        self.reversed_graph.remove_edges_from([ (b,a) for a,b in es])
# the predecessors function I wanted
    def predecessors(self, n):
        return self.reversed_graph.successors(n)

What do you think about this solution? It might double the memory usage, but I think that's acceptable. Is it too complicated? Is this good design?


Solution

  • There is a predecessor (and predecessor_iter in older versions) method on DiGraphs

    Also there is nothing stopping you from accessing the data structure directly as G.pred

     In [1]: import networkx as nx
     In [2]: G = nx.DiGraph() # a directed graph
     In [3]: G.add_edge('a', 'b')
     In [4]: list(G.predecessors('b'))
     Out[4]: ['a']
     In [5]: G.pred['b']
     Out[5]: {'a': {}}