I am trying to read a quote list via Python. The list looks like this:
<quotelist
xmlns="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="quotationlist.xsd">
<quote key = "0">
<author>Author 0</author>
<text>Text 0</text>
</quote>
<quote key = "1">
<author>Author 1</author>
<text>Text 1.</text>
</quote>
<quote key = "2">
<author>Author 2</author>
<text>Text 2.</text>
</quote>
</quotelist>
I want one quote per day, so the key is the day of the year (0 to 364). But I am struggling to read out the quote for day x with Python.
from xml.dom import minidom
dayOfYear = 44 #not relevant, I know how to find this out
mydoc = minidom.parse('./media/quotes.xml')
items = mydoc.getElementsByTagName('quote')
print(items)
This gives me a list of the 365 quote elements, which is what I expected. But what function is there to find the quote whose key equals "dayOfYear"? Is there a way to avoid loading all of them? And how do I then get the values of author and text?
You'll have to build that data structure on your own. In this case I chose a nested dict:
items = mydoc.getElementsByTagName('quote')
output = {int(item.getAttribute('key')): {'author': item.getElementsByTagName('author')[0].firstChild.nodeValue,
                                          'text': item.getElementsByTagName('text')[0].firstChild.nodeValue}
          for item in items}
print(output)
Outputs
{0: {'author': 'Author 0',
     'text': 'Text 0'},
 1: {'author': 'Author 1',
     'text': 'Text 1.'},
 2: {'author': 'Author 2',
     'text': 'Text 2.'}}
Then you can directly access each "day" that you want, e.g. output[0], output[1], etc.
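As for not loading everything: minidom.parse always builds the whole document in memory. If that matters, one possible alternative (a rough sketch, not the only way) is xml.etree.ElementTree.iterparse from the standard library, which reads the file element by element, so you can stop at the first matching key. The names find_quote, NS and SAMPLE below are made up for illustration, and the namespace string is simply copied from the xmlns attribute in the question. The sample is fed in through io.BytesIO so the snippet runs on its own. With a real file you would pass the path instead.

```python
import io
import xml.etree.ElementTree as ET

# Default namespace copied from the xmlns attribute of <quotelist>.
# ElementTree prefixes every tag name with it.
NS = "{http://www.w3schools.com}"


def find_quote(source, day):
    """Return {'author': ..., 'text': ...} for the quote whose key is `day`.

    `source` is a filename or a binary file-like object.
    Returns None if no quote has that key. (Hypothetical helper.)
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != NS + "quote":
            continue  # skip <author>, <text> and the root element
        if elem.get("key") == str(day):
            return {
                "author": elem.findtext(NS + "author"),
                "text": elem.findtext(NS + "text"),
            }
        elem.clear()  # discard the content of quotes we do not need
    return None


# Shortened copy of the XML from the question, used as test input.
SAMPLE = b"""<quotelist
xmlns="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="quotationlist.xsd">
<quote key = "0">
<author>Author 0</author>
<text>Text 0</text>
</quote>
<quote key = "1">
<author>Author 1</author>
<text>Text 1.</text>
</quote>
</quotelist>"""

result = find_quote(io.BytesIO(SAMPLE), 1)
print(result)  # {'author': 'Author 1', 'text': 'Text 1.'}
```

The file is still read sequentially up to the matching quote, so this saves memory and stops early rather than jumping straight to day x. For a 365-entry file the nested dict above is probably simpler and fast enough.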