I'm creating a dictionary from an XML file as follows:
for edge in root.findall('n:graph/n:edge', ns):
    source = edge.get('source')
    target = edge.get('target')
    edges[(source, target)] = tuple([data.text for data in edge if
        (data.get('key') == keys[0] or data.get('key') == keys[1])])
Which gives me an output like this one:
{ ( '4893468839', '977369380' ) : ( 'name', ' length') ... }
Is there a way to put in a default text 'noName' when the value for the name field is missing? All the edges have a length value, but not all of them have a name, so I want to avoid an output like:
{ ( '4893468839', '977369380' ) : ( ' length' , ) ... }
To get something like this in that case:
{ ( '4893468839', '977369380' ) : ( 'noName' , ' length' ) ... }
More detailed information:
from lxml import etree
class graph():
    _path = ""
    def _readFile(self):
        data = etree.parse(self._path)
        root = data.getroot()
        for edge in root.findall('n:graph/n:edge', ns):
            source = edge.get('source')
            target = edge.get('target')
            edges[(source, target)] = tuple([data.text for data in edge if data.get('key') in keys[:2]])
Given a piece of XML like the following:
<key attr.name="ref" attr.type="string" for="edge" id="d14" />
<key attr.name="name" attr.type="string" for="edge" id="d13" />
<key attr.name="geometry" attr.type="string" for="edge" id="d12" />
<key attr.name="length" attr.type="string" for="edge" id="d11" />
<key attr.name="oneway" attr.type="string" for="edge" id="d10" />
<key attr.name="highway" attr.type="string" for="edge" id="d9" />
<key attr.name="bridge" attr.type="string" for="edge" id="d8" />
<key attr.name="osmid" attr.type="string" for="edge" id="d7" />
<edge id="0" source="4331489627" target="4331489577">
<data key="d7">435211336</data>
<data key="d13">Calle Carretera</data>
<data key="d9">residential</data>
<data key="d10">False</data>
<data key="d11">52.45</data>
<data key="d12">LINESTRING (-4.8413613 39.4799045, -4.8414814 39.4798489, -4.8419449 39.4797838)</data>
</edge>
This generates the following output, which I'm okay with:
{ ( '4331489627', '4331489577' ) : ( 'Calle Carretera', ' 52.45') }
But some edges are missing the name (d13) data tag, like this one:
<edge id="0" source="982621562" target="946409159">
<data key="d7">483537143</data>
<data key="d14">CM-4106</data>
<data key="d9">secondary</data>
<data key="d10">False</data>
<data key="d11">104.66499999999999</data>
<data key="d12">LINESTRING (-4.8366071 39.4783468, -4.8368979 39.4789602, -4.8371678 39.4791592)</data>
</edge>
In those cases, I'm getting this output, since the name tag is not found:
{ ( '982621562', '946409159' ) : (' 104.66499999999999', ) }
If possible, I would like to get something like this instead:
{ ( '982621562', '946409159' ) : ( 'noName', ' 104.66499999999999') }
Based on the above, I've put together an example that actually works:
from lxml import etree
root = etree.fromstring("""
<xml><graph>
<key attr.name="ref" attr.type="string" for="edge" id="d14" />
<key attr.name="name" attr.type="string" for="edge" id="d13" />
<key attr.name="geometry" attr.type="string" for="edge" id="d12" />
<key attr.name="length" attr.type="string" for="edge" id="d11" />
<key attr.name="oneway" attr.type="string" for="edge" id="d10" />
<key attr.name="highway" attr.type="string" for="edge" id="d9" />
<key attr.name="bridge" attr.type="string" for="edge" id="d8" />
<key attr.name="osmid" attr.type="string" for="edge" id="d7" />
<edge id="0" source="4331489627" target="4331489577">
<data key="d7">435211336</data>
<data key="d13">Calle Carretera</data>
<data key="d9">residential</data>
<data key="d10">False</data>
<data key="d11">52.45</data>
<data key="d12">LINESTRING (-4.8413613 39.4799045, -4.8414814 39.4798489, -4.8419449 39.4797838)</data>
</edge>
<edge id="0" source="982621562" target="946409159">
<data key="d7">483537143</data>
<data key="d14">CM-4106</data>
<data key="d9">secondary</data>
<data key="d10">False</data>
<data key="d11">104.66499999999999</data>
<data key="d12">LINESTRING (-4.8366071 39.4783468, -4.8368979 39.4789602, -4.8371678 39.4791592)</data>
</edge>
</graph></xml>
""")
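# Map each attribute name (e.g. 'name') to its key id (e.g. 'd13').
keys = {}
for key in root.findall('graph/key'):
    keys[key.get('attr.name')] = key.get('id')
key_name = keys['name']
key_length = keys['length']
out = {}
for edge in root.findall('graph/edge'):
    # Collect every <data> child of this edge as {key id: text}.
    data = dict((d.get('key'), d.text) for d in edge.findall('data'))
    # Always build a two-element tuple, falling back to 'noName'.
    value = (data.get(key_name, 'noName'), data[key_length])
    out[(edge.get('source'), edge.get('target'))] = value
print(out)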
Note that you get 'noName' for the second edge now. Before, the name was "missing" because your comprehension only kept the data elements that exist, so an absent name simply produced a shorter tuple. Instead, my code builds a dictionary from each edge's data elements and then always populates the values in out with tuples containing two elements.
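Since your original code queries with an 'n:' prefix, here is a rough sketch of the same idea wrapped in a function that handles a namespace. A few things here are assumptions on my part: that the prefix maps to the standard GraphML namespace, the helper name read_edges and its fields/defaults parameters (hypothetical, not from your code), and the trimmed sample document. I've used the standard library's xml.etree.ElementTree rather than lxml so it runs without extra installs; the findall calls used here should behave the same way with lxml.

```python
import xml.etree.ElementTree as ET

# Assumption: the 'n:' prefix in the question maps to the GraphML namespace.
NS = {'n': 'http://graphml.graphdrawing.org/xmlns'}


def read_edges(root, fields=('name', 'length'), defaults=None):
    """Map (source, target) to a fixed-length tuple of edge attributes.

    Hypothetical helper: `fields` are attr.name values, and `defaults`
    maps a field to the fallback used when an edge has no <data> for it.
    """
    defaults = defaults or {'name': 'noName'}
    # Translate attribute names ('name') into key ids ('d13').
    ids = {k.get('attr.name'): k.get('id')
           for k in root.findall('.//n:key', NS)}
    edges = {}
    for edge in root.findall('n:graph/n:edge', NS):
        # Look values up by key id instead of filtering the children,
        # so a missing field yields its default rather than a gap.
        data = {d.get('key'): d.text for d in edge.findall('n:data', NS)}
        edges[(edge.get('source'), edge.get('target'))] = tuple(
            data.get(ids[f], defaults.get(f)) for f in fields
        )
    return edges


# Trimmed sample for illustration only; the second edge has no name.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key attr.name="name" attr.type="string" for="edge" id="d13" />
  <key attr.name="length" attr.type="string" for="edge" id="d11" />
  <graph>
    <edge id="0" source="4331489627" target="4331489577">
      <data key="d13">Calle Carretera</data>
      <data key="d11">52.45</data>
    </edge>
    <edge id="1" source="982621562" target="946409159">
      <data key="d11">104.66499999999999</data>
    </edge>
  </graph>
</graphml>"""

edges = read_edges(ET.fromstring(SAMPLE))
print(edges)
```

With this sample, the edge without a name maps to ('noName', '104.66499999999999'). A field that has no entry in defaults comes back as None, so the tuple length stays fixed either way.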