Search code examples
python-3.xsaxonsaxon-c

How to get XQuery response in python native types


I'm working with an XQuery on an XML file and need the output in a Python-native format. Following the documentation, I tried executing the code below, however, the output I'm getting is in <class 'saxonche.PyXdmValue'>, and I'm struggling to convert it into native Python data types.

from saxonche import *

P = PyXQueryProcessor(license=False)
Q = P.new_xquery_processor()
Q.set_context(file_name="data.xml")
res = Q.run_query_to_value(query_text=qry_str)

How do I convert the response type(<class 'saxonche.PyXdmValue'>) to python native containers?


Solution

  • As I said in a comment, a PyXdmValue represents an XDM 3.1 sequence of XDM 3.1 items which can be of various, rather different types, like XDM nodes, XDM atomic values (of different type like string, boolean, various numeric types, various date/dateTime/duration types) and function items (including maps and arrays).

    So depending on your query you need to write code taking the result structure and the type of the items into account.

    To give you one example, the query

    declare namespace map = 'http://www.w3.org/2005/xpath-functions/map';
    
    declare namespace output = 'http://www.w3.org/2010/xslt-xquery-serialization';
    
    declare option output:method 'adaptive';
    
    map:merge(
      for $city in //city
      group by $country := $city/@country, $name := $city/@name
      return map:entry($name || '(' || $country || ')', avg($city/@pop))
    )
    

    computes the average population for a city in a country and returns the result as an XDM map with xs:string key values and xs:double property values.

    An example input file would be e.g.

    <cities>
      <city name="Milano"  country="Italia"  year="1950"   pop="5.23"/>
      <city name="Milano"  country="Italia"  year="1960"   pop="5.29"/>
      <city name="Padova"  country="Italia"  year="1950"   pop="0.69"/>
      <city name="Padova"  country="Italia"  year="1960"   pop="0.93"/>
      <city name="Paris"   country="France"  year="1951"   pop="7.2"/>
      <city name="Paris"   country="France"  year="1961"   pop="7.6"/>
    </cities>
    

    In Python you could access the returned PyXdmMap as the first item [0] of the PyXdmValue and for instance use dictionary comprehension to convert that PyXdmMap into a Python dictionary:

    from saxonche import PySaxonProcessor
    
    with PySaxonProcessor() as saxon_proc:
        print(saxon_proc.version)
    
        xquery_processor = saxon_proc.new_xquery_processor()
    
        xquery_processor.set_context(file_name='cities2.xml')
    
        xdm_result1 = xquery_processor.run_query_to_value(query_file='group_cities2.xq')
    
        print(xdm_result1)
    
        python_dict1 = { key.string_value : xdm_result1[0].get(key)[0].double_value for key in xdm_result1[0].keys()}
    
        print(python_dict1)
    

    Output is e.g.

    SaxonC-HE 12.4 from Saxonica
    map{"Milano(Italia)":5.26e0,"Paris(France)":7.4e0,"Padova(Italia)":8.1e-1}
    {'Milano(Italia)': 5.26, 'Paris(France)': 7.4, 'Padova(Italia)': 0.81}
    

    To give you a different example, where a sequence of XML elements is returned and then, depending on your needs, can be converted into a Pyton list of the string value or the element serialization:

    xquery_processor.set_context(file_name='quotes.xml')
    
    xdm_result2 = xquery_processor.run_query_to_value(query_text='random-number-generator(current-dateTime())?permute(//quote)')
    
    print(xdm_result2)
    
    python_list = [value.string_value for value in xdm_result2]
    
    print(python_list)
    
    python_list = [value.to_string() for value in xdm_result2]
    
    print(python_list)
    

    Output is e.g.

    <quote>Hasta la vista, baby!</quote>
    
    <quote>Get the chopper!</quote>
    
    <quote>I'll be back.</quote>
    ['Hasta la vista, baby!', 'Get the chopper!', "I'll be back."]
    ['<quote>Hasta la vista, baby!</quote>', '<quote>Get the chopper!</quote>', "<quote>I'll be back.</quote>"]