Search code examples
sqlsql-serverxmlxquerysql-openxml

How to access the next sibling of an xml tag in openxml sql


So i have an XML file which looks like below

declare @xml
xml= '<ENVELOPE>
    <BILLFIXED>
        <BILLDATE>29-Jun-2019</BILLDATE>
        <BILLREF>123</BILLREF>
        <BILLPARTY>ABC</BILLPARTY>
    </BILLFIXED>
    <BILLOP>200</BILLOP>
    <BILLCL>200</BILLCL>
    <BILLDUE>29-Jun-2019</BILLDUE>
    <BILLOVERDUE>1116</BILLOVERDUE>
    <BILLFIXED>
        <BILLDATE>30-Jun-2019</BILLDATE>
        <BILLREF>April To June -19</BILLREF>
        <BILLPARTY>efg</BILLPARTY>
    </BILLFIXED>
    <BILLOP>100</BILLOP>
    <BILLCL>100</BILLCL>
    <BILLDUE>30-Jun-2019</BILLDUE>
    <BILLOVERDUE>1115</BILLOVERDUE>
</ENVELOPE>

Im trying to read this using openxml

DECLARE  @hDoc AS INT, @SQL NVARCHAR (MAX) 

EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML 

select   BILLDATE, BILLREF, BILLPARTY, BILLOP,BILLCL,BILLDUE, BILLOVERDUE
from OPENXML(@hDoc, '//BILLFIXED') 
WITH  
( 
BillDate [varchar](50) 'BILLDATE', 
BIllREF [varchar](50) 'BILLREF', 
BILLPARTY [varchar](100) 'BILLPARTY' 
,BILLOP [varchar](100) 'BILLOP' 
 BILLCL[varchar](100) 'REFERENCE', 
 BILLDUE [varchar](100) 'BILLDUE', 
 BILLOVERDUE [varchar](100) 'BILLOVERDUE'
) 

It was easy to extract <BILLFIXED> tag but not able to access the siblings tags

<BILLCL>
<BILLDUE>
<BILLOVERDUE>

any help in accessing these tags Thanks


Solution

  • I agree you should not use OPENXML, and instead use .nodes and .value.

    Unfortunately, SQL Server does not allow the sibling:: axis in XQuery, which would have made this much easier.

    You can do this purely using XQuery, by using the >> positional predicate.

    SELECT
          c.value('(BILLDATE /text())[1]', 'VARCHAR(20)') AS BILLDATE
        , c.value('(BILLREF  /text())[1]', 'VARCHAR(20)') AS BILLREF
        , c.value('(BILLPARTY/text())[1]', 'VARCHAR(20)') AS BILLPARTY
        , c.value('let $i := . return (/ENVELOPE/BILLOP     [. >> $i]/text())[1]', 'INT') AS BILLOP
        , c.value('let $i := . return (/ENVELOPE/BILLCL     [. >> $i]/text())[1]', 'INT') AS BILLCL
        , c.value('let $i := . return (/ENVELOPE/BILLDUE    [. >> $i]/text())[1]', 'VARCHAR(20)') AS BILLDUE
        , c.value('let $i := . return (/ENVELOPE/BILLOVERDUE[. >> $i]/text())[1]', 'INT') AS BILLOVERDUE
    FROM @xml.nodes('/ENVELOPE/BILLFIXED') AS t(c);
    

    What this does is as follows:

    • Shred just the /ENVELOPE/BILLFIXED nodes.
    • For each of those, return the children as normal.
    • For each sibling, do the following steps:
      • Store the current node in $i.
      • Take the first child node matching the correct name, of the parent,...
      • ...where that node is positioned after $i, so [. >> $i]
      • Take the first child node's text [1]

    db<>fiddle

    If some of the nodes could be empty or missing, then you also need to check that the sibling node is before the next BILLFIXED node

    SELECT
          c.value('(BILLDATE /text())[1]', 'VARCHAR(20)') AS BILLDATE
        , c.value('(BILLREF  /text())[1]', 'VARCHAR(20)') AS BILLREF
        , c.value('(BILLPARTY/text())[1]', 'VARCHAR(20)') AS BILLPARTY
        , c.value('let $i := . let $nxt := (/ENVELOPE/BILLFIXED[. >> $i])[1] return (/ENVELOPE/BILLOP     [. >> $i][. << $nxt]/text())[1]', 'INT') AS BILLOP
        , c.value('let $i := . let $nxt := (/ENVELOPE/BILLFIXED[. >> $i])[1] return (/ENVELOPE/BILLCL     [. >> $i][. << $nxt]/text())[1]', 'INT') AS BILLCL
        , c.value('let $i := . let $nxt := (/ENVELOPE/BILLFIXED[. >> $i])[1] return (/ENVELOPE/BILLDUE    [. >> $i][. << $nxt]/text())[1]', 'VARCHAR(20)') AS BILLDUE
        , c.value('let $i := . let $nxt := (/ENVELOPE/BILLFIXED[. >> $i])[1] return (/ENVELOPE/BILLOVERDUE[. >> $i][. << $nxt]/text())[1]', 'INT') AS BILLOVERDUE
    FROM @xml.nodes('/ENVELOPE/BILLFIXED') AS t(c);
    

    db<>fiddle


    One final option is to transform the XML into a more normalized structure, and then query that

    SELECT
          f.value('(BILLDATE   /text())[1]', 'VARCHAR(20)') AS BILLDATE
        , f.value('(BILLREF    /text())[1]', 'VARCHAR(20)') AS BILLREF
        , f.value('(BILLPARTY  /text())[1]', 'VARCHAR(20)') AS BILLPARTY
        , c.value('(BILLOP     /text())[1]', 'INT') AS BILLOP
        , c.value('(BILLCL     /text())[1]', 'INT') AS BILLCL
        , c.value('(BILLDUE    /text())[1]', 'VARCHAR(20)') AS BILLDUE
        , c.value('(BILLOVERDUE/text())[1]', 'INT') AS BILLOVERDUE
    FROM (VALUES(
      @xml.query('
        for $bf in /ENVELOPE/BILLFIXED
        let $nxt := (/ENVELOPE/BILLFIXED[. >> $bf])[1]
        return
          <ENVELOPE>
            {$bf}
            {
                if ($nxt) then
                    /ENVELOPE/*[. >> $bf][. << $nxt]
                else
                    /ENVELOPE/*[. >> $bf]
            }
          </ENVELOPE>
      ')
    ) ) v(transformed)
    CROSS APPLY v.transformed.nodes('/ENVELOPE') AS t(c)
    CROSS APPLY t.c.nodes('BILLFIXED') t2(f);
    

    db<>fiddle

    Steps are as follows:

    • Use .query to create a new XML document.
    • For each BILLFIXED node, store that in $bf
    • Store the BILLFIXED node which follows $bf in $nxt
    • Return an ENVELOPE node, containing $bf as well as all nodes which...
      • ... are after $bf
      • ... and before $nxt if there is a $nxt
    • Query that result as normal, using .nodes