Search code examples
xmloracle-databasexmltable

Get the minimum value from XML set of nodes using XPath


I have a sub-optimal process to get the minimum value out of an XML-document. What I want to achieve is a more optimal process by using XML to process the requested data correctly.

Assume I have the following xml document in a table:

<document>
  <name>...</name>
  <address>...</address>
  <mutation>
    <mutationDate>2025-01-01T12:14:16</mutationDate>
  </mutation>
  <mutation>
    <mutationDate>2025-01-02T13:18:24</mutationDate>
  </mutation>
  <mutation>
    <mutationDate>2025-01-03T10:36:50</mutationDate>
  </mutation>
</document>

The mutationDate nodes can be accessed by plain SQL when the document is converted/interpretated using XMLTable, see example below:

with xt as
  ( select q'[<document>
  <name>...</name>
  <address>...</address>
  <mutation>
    <mutationDate>2025-01-02</mutationDate>
  </mutation>
  <mutation>
    <mutationDate>2025-01-01</mutationDate>
  </mutation>
  <mutation>
    <mutationDate>2025-01-03</mutationDate>
  </mutation>
</document>]' as xml_doc
  from dual)
select s.*
     , ( select min(regexp_substr( mutationdate_agg,  
                                   '[^;]+',  
                                   1,  
                                   level  
                                 )) value  
          from  dual  
          connect by level <=   
            length ( mutationdate_agg ) - length ( replace ( mutationdate_agg, ';' ) ) + 1
       ) as mutationdate_sql_min
       --
from   xt
     , xmltable( '/document'
                 passing xmltype(xt.xml_doc)
                 columns mutationdate_1        varchar2(256)  path 'mutation[1]/mutationDate'
                       , mutationdate_2        varchar2(256)  path 'mutation[2]/mutationDate'
                       , mutationdate_3        varchar2(256)  path 'mutation[3]/mutationDate'
                       , mutationdate_agg      varchar2(256)  path 'string-join( mutation/mutationDate, ";")'
                       --, mutationdate_xml_min      varchar2(256)  path 'min( mutation/mutationDate)'
     ) s
/

db-fiddle

Right now, as you can see, I am joining these values in XML using string-join with a semi-colon separator and then split these values afterwards again in SQL using regexp and hierarchy query 'connect by'.

When I am trying to call a min() function in XML directly, see my last (commented) column "mutationdate_xml_min", I get the following error:

ORA-19112: error raised during evaluation: 
XVM-01123: [FORG0001] Invalid value for cast/constructor
19112. 00000 -  "error raised during evaluation: %s"
*Cause:    The error function was called during evaluation of the XQuery expression.
*Action:   Check the detailed error message for the possible causes.

Someone who can help me with the XML/XPath syntax of that?

Thanks!


Solution

  • By default, fn:min() will try to convert data to double, which obviously wont' work for your date/timestamp strings. From the spec:

    • Values of type xs:untypedAtomic in $arg are cast to xs:double.

    You can explicitly cast each value to date before applying the function:

    , mutationdate_xml_min      varchar2(256)  path 'min(mutation/xs:date(mutationDate))'
    
    MUTATIONDATE_XML_MIN MUTATIONDATE_SQL_MIN
    2025-01-01+00:00 2025-01-01

    That might be OK if you only want the data part, but it loses the time, which is probably useful. Or for your original timestamp values:

    , mutationdate_xml_min      varchar2(256)  path 'min(mutation/xs:dateTime(mutationDate))'
    
    MUTATIONDATE_XML_MIN MUTATIONDATE_SQL_MIN
    2025-01-01T12:14:16.000000+00:00 2025-01-01T12:14:16

    which looks more useful.

    For the timestamp version you can extract the value as a timestamp with timezone rather than as a string, if you want:

    , mutationdate_xml_min      timestamp with time zone  path 'min(mutation/xs:dateTime(mutationDate))'
    

    It isn't a plain timestamp because the canonical dateTime representation includes the time zone, which will default to +00:00 as one has been specified - but you can cast that back to a plain timestamp to remove that.

    You could remove that in the XPath too, which would let you extract it as a plain timestamp:

    , mutationdate_xml_min      timestamp  path 'substring-before(xs:string(min(mutation/xs:dateTime(mutationDate))), "+00:00")'
    

    fiddle

    As Oracle doesn't have a time zone-aware date, you would need to strip the +00:00 for that version, which you could do in the XPath again, and then extract it as a date. But you still lose the time part.