Search code examples
sql-serverxml-parsingxquery

Retrieve associated value from next node for each tag


I have the following XML:

<Envelope format="ProceedoOrderTransaction2.1">
  <Sender>SENDER</Sender>
  <Receiver>RECEIVER</Receiver>
  <EnvelopeID>xxxxx</EnvelopeID>
  <Date>2021-05-06</Date>
  <Time>11:59:46</Time>
  <NumberOfOrder>1</NumberOfOrder>
  <Order>
    <Header>
      <OrderNumber>POXXXXX</OrderNumber>
    </Header>
    <Lines>
      <Line>
        <LineNumber>1</LineNumber>
        <ItemName>Ipsum Lorum</ItemName>
        <SupplierArticleNumber>999999</SupplierArticleNumber>
        <UnitPrice vatRate="25.0">50</UnitPrice>
        <UnitPriceBasis>1</UnitPriceBasis>
        <OrderedQuantity unit="Styck">200</OrderedQuantity>
        <AdditionalItemProperty Key="ARTIKELNUMMER" Description="Unik ordermärkning (artikelnummer):" />
        <Value>999999</Value>
        <AdditionalItemProperty Key="BESKRIVNING" Description="Kort beskrivning:" />
        <Value>Ipsum Lorum</Value>
        <AdditionalItemProperty Key="BSKRIVNING" Description="Beskrivning:" />
        <Value>Ipsum Lorum</Value>
        <AdditionalItemProperty Key="ENHET" Description="Enhet:" />
        <Value>Styck</Value>
        <AdditionalItemProperty Key="KVANTITET" Description="Kvantitet:" />
        <Value>200</Value>
        <AdditionalItemProperty Key="PRIS" Description="Pris/Enhet (ex. moms):" />
        <Value>50</Value>
        <AdditionalItemProperty Key="VALUTA" Description="Valuta:" />
        <Value>SEK</Value>
        <Accounting>
          <AccountingLine amount="10000">
            <AccountingValue dimensionPosition="001" dimensionExternalID="ACCOUNT">xxx</AccountingValue>
            <AccountingValue dimensionPosition="002" dimensionExternalID="F1">Ipsum Lorum</AccountingValue>
            <AccountingValue dimensionPosition="005" dimensionExternalID="F3">1</AccountingValue>
            <AccountingValue dimensionPosition="010" dimensionExternalID="F2">9999</AccountingValue>
          </AccountingLine>
        </Accounting>
      </Line>
    </Lines>
  </Order>
</Envelope>

I am able to parse out all values correctly to table structure except for 1 value in a way that ensures its it associated with its tag. So where I stumble is that I am correctly getting 1 row per AdditionalItemProperty and I am able to get the Key and Description tag values, for example BESKRIVNING and Kort beskrivning:, but I can't (in a reasonable way) get the value between <Value> </Value> brackets that is also associated with each tag value. So for tag key value BESKRIVNING the associated value is 99999 which seem to be on same hierarchy level (insane I know) as the AdditionalItemProperty it is associated with. Seems like they use logic that value for a AdditionalItemProperty will be following the AdditionalItemProperty tag.

I am using SQL Server 2019. I have gotten this far:

-- Purchaseorderrowattributes
select top(10)
    i.value(N'(./Header/OrderNumber/text())[1]', 'nvarchar(30)') as OrderNumber,
    ap.value(N'(../LineNumber/text())[1]', 'nvarchar(30)') as LineNumber,
    ap.value(N'(@Description)', 'nvarchar(50)') property_description
from
    load.proceedo_orders t
outer apply 
    t.xml_data.nodes('Envelope/Order') AS i(i)
outer apply 
    i.nodes('Lines/Line/AdditionalItemProperty') as ap(ap)
where 
    file_name = @filename

Which produces the following output:

OrderNumber LineNumber  property_description
--------------------------------------------
PO170006416     1       Antal timmar
PO170006416     1       Beskrivning
PO170006416     1       Kompetensområde
PO170006416     1       Roll
PO170006416     1       Ordernummer
PO170006416     1       Timpris

I can't find a way to add the value for each property in a correct way. Since the ordering of the values will always be same as the ordering of the AdditionalItemProperty i found solution to get ordering of the AdditionalItemProperty and then i could use rownumber and i was then hoping to input the rownumber value into the bracket in

ap.value(N'(../Value/text())[1]', 'nvarchar(50)') property_description

but SQL Server throws exception that it has to be string literal.

So to be clear what I tried doing with something like:

ap.value(CONCAT( N'(../Value/text())[', CAST(ROWNUMBER as varchar) ,']'), 'nvarchar(50)') property_description

Solution

  • SQL Server uses XQuery 1.0.

    You can make use of the >> Node Comparison operator to find the Value sibling node with code similar to the following:

    -- Purchaseorderrowattributes
    select top(10)
      i.value(N'(./Header/OrderNumber/text())[1]', 'nvarchar(30)') as  OrderNumber
      ,ap.value(N'(../LineNumber/text())[1]', 'nvarchar(30)') as LineNumber
      ,ap.value(N'(@Description)', 'nvarchar(50)') property_description
      ,ap.query('
        let $property := .
        return (../Value[. >> $property][1])/text()
      ').value('.[1]', 'varchar(20)') as property_value
    from load.proceedo_orders t
    outer apply t.xml_data.nodes('Envelope/Order') AS i(i)
    outer apply i.nodes('Lines/Line/AdditionalItemProperty') as ap(ap)
    where file_name = @filename
    

    So what's going on here?

    • let $property := . is creating a reference to the current AdditionalItemProperty node.
    • ../Value[. >> $property] is ascending to the parent node, Line, and descending again to find Value nodes after the AditionalItemProperty node reference in document order, with [1] selecting the first one of those nodes.

    See the 3.5.3 Node Comparisons section for a little more detail.