Search code examples
sql-serverxmlsql-server-2008sqlxml

Edit XML with SQL query SQL Server 2008 R2


I have the following problem and hope someone could help.

I have a SQL Server database with a couple thousand rows. Every row consist of a column with an ID and a column with XML data.

This XML data looks something like:

<record id="1">
 <field tag="aa" occ="1" lang="nl-NL" invariant="false">Jan</field>
 <field tag="aa" occ="1" lang="en-US" invariant="false">John</field>
 <field tag="aa" occ="1" lang="de-DE" invariant="false">der Jan</field>
 <field tag="aa" occ="2" lang="nl-NL" invariant="false">Jan2</field>
 <field tag="aa" occ="2" lang="en-US" invariant="false">John2</field>
 <field tag="ab" occ="1">Something</field>
 <field tag="ac" occ="1" lang="de-DE" invariant="false">Rechnung</field>
 <field tag="ac" occ="1" lang="nl-NL" invariant="false">rekening</field>
 <field tag="ad" occ="1">Something2</field>
 <field tag="ae" occ="1" lang="nl-NL" invariant="false">stoeptegel</field>
</record>

I would like to edit this XML for every record according to the following rules:

  1. For every unique occ (occurence), tag combination only 1 @invariant attribute can be true
  2. If a has @lang=en-US attribute, then @invariant has to be 'true'. Remaining fields with same occ, tag combination have to remain 'false'. (like tag aa in sample code)
  3. If a has @lang=nl-NL attribute, but no @lang=en-US, then @invariant has to be 'true' for 'nl-NL'. Remaining fields with same occ, tag combination have to remain 'false'. (like tag ac in sample code)
  4. If a occ, tag combination has only 1 instance, then @invariant has to be 'true'. So independent of @lang value. (like tag ae in sample code)

After running 1 or more SQL queries, the code should look like:

<record id="1">
 <field tag="aa" occ="1" lang="nl-NL" invariant="false">Jan</field>
 <field tag="aa" occ="1" lang="en-US" invariant="true">John</field>
 <field tag="aa" occ="1" lang="de-DE" invariant="false">der Jan</field>
 <field tag="aa" occ="2" lang="nl-NL" invariant="false">Jan2</field>
 <field tag="aa" occ="2" lang="en-US" invariant="true">John2</field>
 <field tag="ab" occ="1">Something</field>
 <field tag="ac" occ="1" lang="de-DE" invariant="false">Rechnung</field>
 <field tag="ac" occ="1" lang="nl-NL" invariant="true">rekening</field>
 <field tag="ad" occ="1">Something2</field>
 <field tag="ae" occ="1" lang="nl-NL" invariant="true">stoeptegel</field>
</record>

My problem is creating the correct SQL query, to replace all nodes for all records, according to the above rules.

So far I came up with this:

while exists 
(
select * 
from databasetable 
where xmlcolumn.exist('/record/field/@invariant[.="false"]') = 1
)

update databasetable
set xmlcolumn.modify
('replace value of (/record/field/@invariant[.="false"])[1] with "true"')

Which edits every value of @invariant into 'true'.

Could someone help me build the correct query? Thanks in advance!


Solution

  • Shred your XML and use row_number() with an order by clause that orders en-US first and nl-NL second.
    Use a second row_number() to generate a unique key for each row (ID and RowNumber).
    Store the values in a table variable.
    Get the max row number and update the XML i a loop for each row number.

    declare @Tmp table
    (
      ID int, -- Primary key in databasetable
      RowNumber int,
      Tag varchar(2),
      Occ int,
      Lang varchar(5),
      Invariant bit
      primary key (ID, RowNumber)
    );
    
    with C1 as
    (
      select T.ID, -- Primary key in databasetable
             R.F.value('@tag', 'varchar(2)') as Tag,
             R.F.value('@occ', 'int') as Occ,
             R.F.value('@lang', 'varchar(5)') as Lang
      from databasetable as T
        cross apply T.xmlcolumn.nodes('/record/field') as R(F)
    ), 
    C2 as
    (
      select ID, Tag, Occ, Lang,
             row_number() over(partition by ID order by (select 0)) as RowNumber,
             row_number() over(partition by ID, Tag, Occ 
                               order by case Lang 
                                          when 'en-US' then 1
                                          when 'nl-NL' then 2
                                          else 3
                                        end) as rnInv
      from C1
    )
    insert into @Tmp (ID, RowNumber, Tag, Occ, Lang, Invariant)
    select ID, RowNumber, Tag, Occ, Lang, case rnInv when 1 then 1 else 0 end
    from C2;
    
    declare @MaxRowNum int;
    declare @I int = 1;
    
    select @MaxRowNum = max(RowNumber)
    from @Tmp;
    
    while @I <= @MaxRowNum
    begin
      update T
      set xmlcolumn.modify('replace value of (/record/field[@tag = sql:column("Tmp.Tag") and
                                                            @occ = sql:column("Tmp.Occ") and
                                                            @lang = sql:column("Tmp.Lang")]/@invariant)[1] 
                              with sql:column("Tmp.Invariant")')
      from databasetable as T
        inner join @Tmp as Tmp
          on T.ID = Tmp.ID
      where Tmp.RowNumber = @I;
    
      set @I += 1;
    end
    

    A working sample can be found here.