python xml xpath attributeerror

python3 xpath can't reach a child node (AttributeError: 'NoneType' object has no attribute 'text')


I need help with an issue I couldn't find an answer to.

I have an xml like this:

<forecast xmlns="http://weather.yandex.ru/forecast" country_id="8996ba26eb0edf7ea5a055dc16c2ccbd" part="Лен Стокгольм" link="http://pogoda.yandex.ru/stockholm/" part_id="53f767b78d8f180c28d55ebda1d07e0c" lat="59.381981" slug="stockholm" city="Стокгольм" climate="1" country="Швеция" region="10519" lon="17.956846" zoom="12" id="2464" source="Station" exactname="Стокгольм" geoid="10519">
<fact>...</fact>
<yesterday id="435077826">...</yesterday>
<informer>...</informer>
<day date="2016-04-18">
    <sunrise>05:22</sunrise>
    <sunset>20:12</sunset>
    <moon_phase code="growing-moon">14</moon_phase>
    <moonrise>15:53</moonrise>
    <moonset>04:37</moonset>
    <biomet index="3" geomag="2" low_press="1" uv="1">...</biomet>
    <day_part typeid="1" type="morning">...</day_part>
    <day_part typeid="2" type="day">...</day_part>
    <day_part typeid="3" type="evening">...</day_part>
    <day_part typeid="4" type="night">...</day_part>
    <day_part typeid="5" type="day_short">
        <temperature>11</temperature>
    </day_part>
</day>
</forecast>

(The full XML is available at https://export.yandex.ru/weather-ng/forecasts/2464.xml.) I need to get the text of the temperature element (11) and am trying this code:

import urllib.request
import codecs
import lxml
from xml.etree import ElementTree as ET

def gen_ns(tag):
    if tag.startswith('{'):
        ns, tag = tag.split('}') 
        return ns[1:]
    else:
        return ''
with codecs.open(fname, 'r', encoding = 'utf-8') as t:
        town_tree = ET.parse(t)
        town_root = town_tree.getroot() 
        print (town_root)

        namespaces = {'ns': gen_ns(town_root.tag)}
        print (namespaces)

        for day in town_root.iterfind('ns:day', namespaces):
            date = (day.get('date'))
            print (date)
            day_temp = day.find('.//*[@type="day_short"]/temperature')  
            print (day_temp.text)

This is what I get:

Traceback (most recent call last):
File "weather.py", line 154, in <module>
    print (day_temp.text)
AttributeError: 'NoneType' object has no attribute 'text'

What's wrong with my XPath? I can get the attributes of the element matched by './/*[@type="day_short"]', but I can't get the text of its child (temperature). Thanks, everyone!


Solution

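  • The XML document declares a default namespace (xmlns="http://weather.yandex.ru/forecast"), so every element in it, including temperature, belongs to that namespace. XPath 1.0 has no concept of a default namespace: an unprefixed name in an expression matches only elements that are in no namespace. That is why .//*[@type="day_short"] works (the * wildcard matches any element, and the type attribute is not namespaced) while the /temperature step matches nothing and find() returns None. You either need to map the namespace to a prefix (as you did for day) or use another approach such as local-name() to test only the local part of the tag name. Note that local-name() requires a full XPath 1.0 engine such as lxml's xpath() method; find() in xml.etree.ElementTree supports only a limited XPath subset without functions like this.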

    .//*[@type="day_short"]/*[local-name()='temperature']
    

    or

    day_temp = day.find('.//*[@type="day_short"]/ns:temperature', namespaces)
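
A minimal sketch of the prefix-mapping fix using only the standard library's xml.etree.ElementTree. The inline SAMPLE string is a trimmed-down stand-in for the real feed (the night temperature is an invented value), and day_short_temperatures is a hypothetical helper name, not something from the question. It also shows that the fully qualified {uri}tag form works without a prefix map, and guards against find() returning None:

```python
from xml.etree import ElementTree as ET

# Trimmed-down sample modelled on the feed in the question (illustrative only).
SAMPLE = """<forecast xmlns="http://weather.yandex.ru/forecast" city="Stockholm">
  <day date="2016-04-18">
    <day_part typeid="4" type="night"><temperature>3</temperature></day_part>
    <day_part typeid="5" type="day_short"><temperature>11</temperature></day_part>
  </day>
</forecast>"""

NS_URI = "http://weather.yandex.ru/forecast"
namespaces = {"ns": NS_URI}  # the prefix "ns" is arbitrary; only the URI matters

root = ET.fromstring(SAMPLE)


def day_short_temperatures(root):
    """Return {date: temperature text} for every <day> element (hypothetical helper)."""
    result = {}
    for day in root.iterfind("ns:day", namespaces):
        node = day.find('.//*[@type="day_short"]/ns:temperature', namespaces)
        # find() returns None when nothing matches, so check before using .text
        result[day.get("date")] = node.text if node is not None else None
    return result


temps = day_short_temperatures(root)
print(temps)

day = root.find("ns:day", namespaces)

# The unprefixed name from the question matches nothing, hence the AttributeError.
unprefixed = day.find('.//*[@type="day_short"]/temperature')
print(unprefixed)

# Equivalent lookup with the namespace URI written out in braces, no prefix map needed.
clark = day.find('.//*[@type="day_short"]/{%s}temperature' % NS_URI)
print(clark.text)
```

The explicit `is not None` check is the important part when reading real feeds: a day without a day_short part then yields None instead of raising.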