My question is simple. I'm using networkx to build graphs from protein structures. I'm using subgraph_is_isomorphic() function but I need to allow only some amino acids to be considered equal. For example, LEU should be considered equal to ILE but not to TRP. How can I define a custom node_match function to do this?
In the following code, I need a node_match function that if used, G1 and G2 become isomorphic but not otherwise.
G1 = nx.Graph()
G1.add_node(1, label = 'ARG')
G1.add_node(2, label = 'LEU')
G1.add_edge(1, 2)
G2 = nx.Graph()
G2.add_node(1, label = 'ARG')
G2.add_node(2, label = 'ILE')
G2.add_edge(1, 2)
You have to define your custom node_match
function, and pass it as a parameter when you initialize the GraphMatcher
object.
from networkx.algorithms.isomorphism import GraphMatcher
amm = [
{'A', 'B', 'C'}, # A == B == C
{'E', 'F', 'G'}, # E == F == G
{'H', 'I'}, # H == I
]
def node_match(u, v):
return next((True for s in amm if u['label'] in s and v['label'] in s), False)
gm = GraphMatcher(G1, G2, node_match=node_match)
gm.subgraph_is_isomorphic()
In the amm
list you can create many sets, one for each group of amminoacids to be considered equal. In your example:
>>> amm = [
... {'LEU', 'ILE'},
... {'ARG'},
... {'TRP'},
... ]
>>> gm = GraphMatcher(G1, G2, node_match=node_match)
>>> gm.subgraph_is_isomorphic()
True
If you want to speed things up a bit you can do this:
amm = [
{'A', 'B', 'C'}, # A == B == C
{'E', 'F', 'G'}, # E == F == G
{'H', 'I'}, # H == I
]
match = {a: s for s in amm for a in s}
def node_match(u, v):
return v['label'] in match[u['label']]