Python Networkx - Remove specific node and related edges


I have a GraphML file that contains nodes and edges. I am trying to remove every node where data['zone'] != 'gold'. This is what my graph looks like:

<?xml version='1.0' encoding='utf-8'?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
  <key id="d4" for="node" attr.name="source" attr.type="string" />
  <key id="d3" for="node" attr.name="use_case" attr.type="string" />
  <key id="d2" for="node" attr.name="kind" attr.type="string" />
  <key id="d1" for="node" attr.name="zone" attr.type="string" />
  <key id="d0" for="node" attr.name="table" attr.type="string" />
  <graph edgedefault="directed">
    <node id="097373">
      <data key="d0">valid_outer_cases_not_attached_to_pallet</data>
      <data key="d1">gold</data>
      <data key="d2">adhoc</data>
      <data key="d3">low</data>
    </node>
    <node id="36372">
      <data key="d0">kpis</data>
      <data key="d1">gold</data>
      <data key="d2">adhoc</data>
      <data key="d3">low</data>
    </node>

parser.py

import networkx as nx
import matplotlib.pyplot as plt


input_graph = nx.read_graphml("graph.graphml")

for node, data in input_graph.nodes(data=True):
    if data['zone'] != 'gold':
        input_graph.remove_node(node)

Running this fails with: RuntimeError: dictionary changed size during iteration


Solution

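  • The error occurs because remove_node changes the graph's node dictionary while the loop is still iterating over it. Instead, collect the nodes to remove first, then remove them after the loop. Removing a node also removes the edges attached to it: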

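    # First pass: only record which nodes fail the filter.
    to_remove = []
    for node, data in input_graph.nodes(data=True):
        if data['zone'] != 'gold':
            to_remove.append(node)

    # Second pass: remove them (and their edges) once iteration is done.
    input_graph.remove_nodes_from(to_remove)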
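
The same idea can be sketched as a self-contained example. The small graph below is a hypothetical stand-in for the one loaded from graph.graphml: the two "gold" nodes come from the question, while the "silver" node, the node with no zone, and all the edges are made up for illustration. It also uses data.get("zone") rather than data['zone'], on the assumption that some nodes might lack the attribute, which would otherwise raise a KeyError.

```python
import networkx as nx

# Hypothetical stand-in for nx.read_graphml("graph.graphml").
G = nx.DiGraph()
G.add_node("097373", table="valid_outer_cases_not_attached_to_pallet", zone="gold")
G.add_node("36372", table="kpis", zone="gold")
G.add_node("11111", table="staging_orders", zone="silver")  # made-up node
G.add_node("22222", table="untagged")                       # made-up node, no zone
G.add_edge("11111", "097373")   # made-up edges
G.add_edge("097373", "36372")
G.add_edge("22222", "36372")

# Build the complete list before touching the graph, so nothing is
# removed while the node view is being iterated.
to_remove = [n for n, data in G.nodes(data=True) if data.get("zone") != "gold"]

# Removing a node also drops every edge that starts or ends at it.
G.remove_nodes_from(to_remove)

print(sorted(G.nodes))
print(list(G.edges))
```

Iterating over a snapshot, for example list(input_graph.nodes(data=True)), and calling remove_node inside the loop should work as well, since the loop then walks a separate list rather than the graph's own dictionary.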