
Python, XML: how to access the 3rd child by element name


Could you please help me access the element named 'id' in the following structure in Python? (I have the lxml and xml.etree.ElementTree libraries.)

Desired result: '0000000'

Desired method:

  1. Find the child of the XML document whose name is fcsProtocolEF3.
  2. Within fcsProtocolEF3, find the element named 'id'.

It is crucial to search by element name, not by ordinal position.

I tried something like this:

    tree.findall('{http://zakupki.gov.ru/oos/export/1}fcsProtocolEF3')[0].findall('{http://zakupki.gov.ru/oos/types/1}id')[0].text

It works, but it requires spelling out the namespaces. The XML documents use different namespaces, and I don't know how to determine them beforehand.

Thank you.

It would be great to use something like XQuery in SQL, where a wildcard stands in for the namespace prefix:

value('(/*:export/*:fcsProtocolEF3/*:id)[1]', 'nvarchar(21)')) AS [id],

XML-document:

<?xml version="1.0" encoding="UTF-8" standalone="true"?>
 <ns2:export xmlns:ns3="http://zakupki.gov.ru/oos/common/1" xmlns:ns4="http://zakupki.gov.ru/oos/base/1" xmlns:ns2="http://zakupki.gov.ru/oos/export/1" xmlns:ns10="http://zakupki.gov.ru/oos/printform/1" xmlns:ns11="http://zakupki.gov.ru/oos/control99/1" xmlns:ns9="http://zakupki.gov.ru/oos/SMTypes/1" xmlns:ns7="http://zakupki.gov.ru/oos/pprf615types/1" xmlns:ns8="http://zakupki.gov.ru/oos/EPtypes/1" xmlns:ns5="http://zakupki.gov.ru/oos/TPtypes/1" xmlns:ns6="http://zakupki.gov.ru/oos/CPtypes/1" xmlns="http://zakupki.gov.ru/oos/types/1">
   <ns2:fcsProtocolEF3 schemeVersion="10.2">
     <id>0000000</id>
     <purchaseNumber>0000000000000000</purchaseNumber>
   </ns2:fcsProtocolEF3>
 </ns2:export>

Solution

  • lxml solution:

    xml = '''<?xml version="1.0"?>
     <ns2:export xmlns:ns3="http://zakupki.gov.ru/oos/common/1" xmlns:ns4="http://zakupki.gov.ru/oos/base/1" xmlns:ns2="http://zakupki.gov.ru/oos/export/1" xmlns:ns10="http://zakupki.gov.ru/oos/printform/1" xmlns:ns11="http://zakupki.gov.ru/oos/control99/1" xmlns:ns9="http://zakupki.gov.ru/oos/SMTypes/1" xmlns:ns7="http://zakupki.gov.ru/oos/pprf615types/1" xmlns:ns8="http://zakupki.gov.ru/oos/EPtypes/1" xmlns:ns5="http://zakupki.gov.ru/oos/TPtypes/1" xmlns:ns6="http://zakupki.gov.ru/oos/CPtypes/1" xmlns="http://zakupki.gov.ru/oos/types/1">
       <ns2:fcsProtocolEF3 schemeVersion="10.2">
         <id>0000000</id>
         <purchaseNumber>0000000000000000</purchaseNumber>
       </ns2:fcsProtocolEF3>
     </ns2:export>'''
     
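    from lxml import etree as et
    
    root = et.fromstring(xml)
    # local-name() compares only the tag name and ignores the namespace,
    # so no namespace URI has to be known in advance.
    text = root.xpath('//*[local-name()="export"]/*[local-name()="fcsProtocolEF3"]/*[local-name()="id"]/text()')[0]
    print(text)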
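
  • xml.etree.ElementTree alternative (a sketch, not part of the original answer): the question mentions that the standard-library xml.etree.ElementTree is also available. Assuming Python 3.8 or newer, its find() paths accept the `{*}` namespace wildcard, which behaves much like `*:` in the XQuery example. The document below is a shortened copy of the one in the question, with the unused xmlns declarations dropped. The names xml_doc, id_element and doc_id are chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's document (unused xmlns declarations dropped).
xml_doc = '''<?xml version="1.0"?>
<ns2:export xmlns:ns2="http://zakupki.gov.ru/oos/export/1" xmlns="http://zakupki.gov.ru/oos/types/1">
  <ns2:fcsProtocolEF3 schemeVersion="10.2">
    <id>0000000</id>
    <purchaseNumber>0000000000000000</purchaseNumber>
  </ns2:fcsProtocolEF3>
</ns2:export>'''

# fromstring() returns the root element, i.e. <export>.
root = ET.fromstring(xml_doc)

# "{*}name" matches "name" in any namespace (or none); assumes Python 3.8+.
# The path is relative to the root, so it starts at the child of <export>.
id_element = root.find('{*}fcsProtocolEF3/{*}id')

# find() returns None when nothing matches, so guard before reading .text.
doc_id = id_element.text if id_element is not None else None
print(doc_id)
```

    On older Python versions the wildcard is not available in xml.etree.ElementTree, and the lxml local-name() approach above remains the safer choice.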