Given XML, I need to convert it to JSON and modify the JSON object.
<?xml version="1.0" standalone="yes"?>
<!--COUNTRIES is the root element-->
<WORLD>
<country name="A">
<event day="323" name="$abcd"> </event>
<event day="23" name="$aklm"> </event>
<neighbor name="B" direction="W" friend="T"></neighbor>
<neighbor name="B" direction="W"></neighbor>
<neighbor name="B" direction="W"></neighbor>
</country>
<country name="C">
<event day="825" name="$nmre"> </event>
<event day="329" name="$lpok"> </event>
<event day="145" name="$dswq"> </event>
<event day="256" name="$tyul"> </event>
<neighbor name="D" direction="N"/>
<neighbor name="B" direction="W" friend="T"/>
</country>
</WORLD>
I want to remove the "event" elements from the final JSON output, as well as the "friend" attribute found under "WORLD" -> "country" -> "neighbor". I am using the "xmltodict" library in Python and can convert the XML to JSON successfully, but I have not been able to remove these elements and attributes from the JSON.
Python Code:
import xmltodict, json

class XMLParser:
    def __init__(self, xml_file_path):
        self.xml_file_path = xml_file_path
        if not self.xml_file_path:
            raise ValueError("XML file path is not found.\n")
        with open(self.xml_file_path, 'r') as f:
            self.xml_file = f.read()

    def parse_xml_to_json(self):
        xml_file = self.xml_file
        json_data = xmltodict.parse(xml_file, attr_prefix='')
        if 'event' in json_data['WORLD']['country']:
            del json_data['WORLD']['country']['event']
        return json.dumps(json_data, indent=4)

xml_file_path = "file_path"
xml_parser = XMLParser(xml_file_path)
json_object = xml_parser.parse_xml_to_json()
print(json_object)
Please suggest.
You can use a recursive function to remove the unwanted keys from the dictionary. The function below removes the key from a dictionary if it is present, then walks the dictionary's values and calls itself on every nested dictionary and on every dictionary found inside a list.
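def remove_key(d: dict, key: str):
    # Drop the key at this level if it is present.
    if key in d:
        d.pop(key)
    # Recurse into nested dicts and into dicts held in lists.
    for val in d.values():
        if isinstance(val, list):
            for item in val:
                # Lists may also hold plain strings or None; skip those.
                if isinstance(item, dict):
                    remove_key(item, key)
        elif isinstance(val, dict):
            remove_key(val, key)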
First, parse the input XML:
import xmltodict
import json
xmltext = """<?xml version="1.0" standalone="yes"?>
<!--COUNTRIES is the root element-->
<WORLD>
<country name="A">
<event day="323" name="$abcd"> </event>
<event day="23" name="$aklm"> </event>
<neighbor name="B" direction="W" friend="T"></neighbor>
<neighbor name="B" direction="W"></neighbor>
<neighbor name="B" direction="W"></neighbor>
</country>
<country name="C">
<event day="825" name="$nmre"> </event>
<event day="329" name="$lpok"> </event>
<event day="145" name="$dswq"> </event>
<event day="256" name="$tyul"> </event>
<neighbor name="D" direction="N"/>
<neighbor name="B" direction="W" friend="T"/>
</country>
</WORLD>"""
d = xmltodict.parse(xmltext)
The value of d is the following:
d
# d has this value:
{'WORLD': {'country': [{'@name': 'A',
                        'event': [{'@day': '323', '@name': '$abcd'},
                                  {'@day': '23', '@name': '$aklm'}],
                        'neighbor': [{'@name': 'B', '@direction': 'W', '@friend': 'T'},
                                     {'@name': 'B', '@direction': 'W'},
                                     {'@name': 'B', '@direction': 'W'}]},
                       {'@name': 'C',
                        'event': [{'@day': '825', '@name': '$nmre'},
                                  {'@day': '329', '@name': '$lpok'},
                                  {'@day': '145', '@name': '$dswq'},
                                  {'@day': '256', '@name': '$tyul'}],
                        'neighbor': [{'@name': 'D', '@direction': 'N'},
                                     {'@name': 'B', '@direction': 'W', '@friend': 'T'}]}]}}
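The structure above also explains why the check in the question never matches: with more than one <country> element, d['WORLD']['country'] is a list of dictionaries, so `'event' in ...` tests list membership rather than looking for a key. A minimal sketch of this, using a trimmed, hand-written stand-in for the parsed structure (the `sample` name and its contents are illustrative only):

```python
# Trimmed, hand-written stand-in for the parsed structure (illustrative only).
sample = {'WORLD': {'country': [
    {'@name': 'A', 'event': [{'@day': '323'}], 'neighbor': [{'@name': 'B', '@friend': 'T'}]},
    {'@name': 'C', 'event': [{'@day': '825'}], 'neighbor': [{'@name': 'D'}]},
]}}

countries = sample['WORLD']['country']

# Membership on a list compares whole items against 'event', so this is False.
print('event' in countries)

# The key lives on each dictionary in the list, so each item has to be visited.
for country in countries:
    country.pop('event', None)

print(['event' in country for country in countries])
```

This handles only one level of nesting by hand, which is why a recursive helper is more convenient for removing '@friend' further down.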
Applying the function to d removes the unwanted keys:
remove_key(d, 'event')
remove_key(d, '@friend')
d
# d now has this value:
{'WORLD': {'country': [{'@name': 'A',
                        'neighbor': [{'@name': 'B', '@direction': 'W'},
                                     {'@name': 'B', '@direction': 'W'},
                                     {'@name': 'B', '@direction': 'W'}]},
                       {'@name': 'C',
                        'neighbor': [{'@name': 'D', '@direction': 'N'},
                                     {'@name': 'B', '@direction': 'W'}]}]}}
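If the parsed dictionary should stay intact, or several keys should go in a single pass, a non-mutating variant may be preferable. The following is only a sketch of that alternative; `without_keys` is a name chosen here for illustration, and the input is a small hand-written dictionary rather than real parser output:

```python
def without_keys(obj, keys):
    """Return a copy of obj with every dict key in `keys` removed, at any depth."""
    if isinstance(obj, dict):
        return {k: without_keys(v, keys) for k, v in obj.items() if k not in keys}
    if isinstance(obj, list):
        return [without_keys(item, keys) for item in obj]
    # Strings, numbers and None are returned unchanged.
    return obj

# Small hand-written input (illustrative only).
original = {'country': [{'@name': 'A',
                         'event': [{'@day': '323'}],
                         'neighbor': [{'@name': 'B', '@friend': 'T'}]}]}

cleaned = without_keys(original, {'event', '@friend'})
print(cleaned)
# {'country': [{'@name': 'A', 'neighbor': [{'@name': 'B'}]}]}
```

Because a new structure is built, `original` still contains its 'event' and '@friend' entries afterwards.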
Now you can export to JSON.
with open('output.json', 'w') as fp:
    json.dump(d, fp, indent=4)
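Note that the code in the question passes attr_prefix='' to xmltodict.parse. In that case attribute keys carry no leading @, so the key to remove would be 'friend' rather than '@friend'. If the dictionary was parsed with the default prefix but the exported JSON should not contain it, the prefix can be stripped before dumping. A minimal sketch, assuming no attribute shares its name with a child element (`strip_prefix` is a hypothetical helper, and the input is hand-written for illustration):

```python
import json

def strip_prefix(obj, prefix='@'):
    """Return a copy of obj with `prefix` removed from the start of every dict key."""
    if isinstance(obj, dict):
        return {(k[len(prefix):] if k.startswith(prefix) else k): strip_prefix(v, prefix)
                for k, v in obj.items()}
    if isinstance(obj, list):
        return [strip_prefix(item, prefix) for item in obj]
    # Non-container values are returned unchanged.
    return obj

# Hand-written input in the same shape as the cleaned dictionary (illustrative only).
cleaned = {'WORLD': {'country': [{'@name': 'C',
                                  'neighbor': [{'@name': 'D', '@direction': 'N'}]}]}}

print(json.dumps(strip_prefix(cleaned), indent=4))
```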