Tags: python-3.x, lxml, lxml.objectify

How to access xml field with lxml?


Python 3.6, lxml, Windows 10

This is driving me crazy. I want to access the item fields, but I always get this error:

AttributeError: 'cython_function_or_method' object has no attribute 'item'

Everything else (address fields etc...) I can access without problems. How can I access the item fields (sku, amount etc...)?

I've used this code:

import requests
from lxml import objectify

url = "URL_TO_XML_FILE"
xml_content = requests.get(url).text.encode('utf-8')

xml = objectify.fromstring(xml_content)

for sale in xml.response.sales.sale:
    for item in sale.items.item:
        print(item.sku)

Here is the beginning of the xml:

<?xml version="1.0" encoding="ISO-8859-1"?>
<getnewsalesresult xmlns="https://pmcdn.priceminister.com/res/schema/getnewsales">
  <request>
    <version>2017-08-07</version>
    <user>SELLER</user>
  </request>  

  <response>
    <lastversion>2017-08-07</lastversion>
    <sellerid>95029358</sellerid>
    <sales>

      <sale>
        <purchaseid>297453287592813953</purchaseid>
        <purchasedate>15/12/2018-19:10</purchasedate>
        <deliveryinformation>
          <shippingtype>Normal</shippingtype>
          <isfullrsl>N</isfullrsl>

          <purchasebuyerlogin><![CDATA[LOGIN]]></purchasebuyerlogin>
          <purchasebuyeremail>EMAIL</purchasebuyeremail>

          <deliveryaddress>
            <civility>Mme</civility>
            <lastname><![CDATA[Lastname]]></lastname>
            <firstname><![CDATA[Firstname]]></firstname>
            <address1><![CDATA[STREET]]></address1>
            <address2><![CDATA[]]></address2>
            <zipcode>13570</zipcode>
            <city><![CDATA[Paris]]></city>

            <country><![CDATA[France]]></country>
            <countryalpha2>FX</countryalpha2>

            <phonenumber1></phonenumber1>
            <phonenumber2>PHONENUMBER</phonenumber2>

          </deliveryaddress>

        </deliveryinformation>
        <items>

          <item>
            <sku><![CDATA[SKU1]]></sku>
            <advertid>411812243030</advertid>
            <advertpricelisted>
              <amount>15.99</amount>
              <currency>EUR</currency>
            </advertpricelisted>
            <itemid>551131040</itemid>
            <headline><![CDATA[HEADLINE]]></headline>
            <itemstatus><![CDATA[REQUESTED]]></itemstatus>
            <ispreorder>N</ispreorder>
            <isnego>N</isnego>
            <negotiationcomment></negotiationcomment>
            <price>
              <amount>15.99</amount>
              <currency>EUR</currency>
            </price>
            <isrsl>N</isrsl>
            <isbn></isbn>
            <ean>4363745894373857474; </ean>
            <paymentstatus><![CDATA[INCOMING]]></paymentstatus>
            <sellerscore></sellerscore>
          </item>
        </items>
      </sale>
      <sale>

Solution

  • The problem is that items is also the name of a method that ObjectifiedElement inherits from the lxml Element class (it returns the element's XML attributes as name/value pairs). Python's normal attribute lookup finds that method before lxml gets a chance to look for a child element, so sale.items returns the bound method, and asking that method for .item raises the AttributeError.

    lxml.objectify implements child lookup in __getattr__, which Python calls only as a fallback when the normal lookup finds nothing. To get the <items> child, call that fallback directly so the method lookup is skipped:

    sale.__getattr__('items')
    

    This will also work. In lxml.objectify, __dict__ is a dictionary-like view that maps each child tag name to the first child with that name:

    sale.__dict__['items']
    

    The revised code:

    import requests
    from lxml import objectify
    
    url = "URL_TO_XML_FILE"
    # Pass the raw bytes so lxml can honor the ISO-8859-1 encoding declared in the XML.
    xml_content = requests.get(url).content
    
    xml = objectify.fromstring(xml_content)
    
    for sale in xml.response.sales.sale:
        for item in sale.__dict__['items'].item:
            print(item.sku)
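
The name clash and the lookups above can be reproduced without a network call. The following is a minimal sketch, assuming a trimmed-down, made-up XML snippet: the XML constant and the SKU2 item are illustrative and not part of the original feed. It also uses the index syntax sale["items"], which lxml.objectify treats as a child lookup by name when the key is a string, so it may read more naturally than calling __getattr__ by hand.

```python
from lxml import objectify

# Hypothetical stand-in for the feed in the question (SKU2 is invented).
XML = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<getnewsalesresult xmlns="https://pmcdn.priceminister.com/res/schema/getnewsales">
  <response>
    <sales>
      <sale>
        <items>
          <item><sku><![CDATA[SKU1]]></sku></item>
          <item><sku><![CDATA[SKU2]]></sku></item>
        </items>
      </sale>
    </sales>
  </response>
</getnewsalesresult>"""

root = objectify.fromstring(XML)
sale = root.response.sales.sale

# Plain attribute access finds the inherited Element.items() method,
# not the <items> child element.
print(callable(sale.items))

# Both of these skip the method and return the <items> child.
by_index = sale["items"]
by_getattr = sale.__getattr__("items")
print(by_index.tag == by_getattr.tag)

# Collect every SKU across all sales using the index syntax.
skus = [
    item.sku.text
    for s in root.response.sales.sale
    for item in s["items"].item
]
print(skus)
```

The same pattern should apply to any child whose tag collides with an Element method or property, such as text, tag, keys or values.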