Search code examples
sqlxmlt-sqlxml-parsing

SQL - How to insert unique id in XML tag


I get fixed XML from external system and I need in T-SQL to attach some unique ID in <document> tag and insert all data into 2 tables.

This is XML that I get:

<root>
    <document number="1234">
        <data1>
            <item1 d="100"/> 
            <item1 d="200"/> 
        </data1>
        <data2>
            <item2 d="111"/> 
            <item2 d="222"/> 
        </data2>
    </document> 
    <document number="1234">
        <data1>
            <item1 d="300"/> 
            <item1 d="400"/> 
        </data1>
    </document> 
    <document number="1234">
        <data1>
            <item1 d="500"/> 
        </data1>
        <data2>
            <item2 d="555"/> 
        </data2>
    </document> 
</root>

I need to insert above XML into 2 tables tData1 and tData2 with unique docid at the <document> level.

Table tData1 should look like this:

docid    number     d
---------------------
    1      1234   100
    1      1234   200
    2      1234   300
    2      1234   400
    3      1234   500

Table tData2 should look like this:

docid    number    d
---------------------
    1      1234   111
    1      1234   222
    3      1234   555

How I can insert docid field while converting above XML into these tables?


Solution

  • You can use nodes in a CROSS APPLY, like this:

    DECLARE @x XML='<root>
        <document number="1234">
            <data1>
                <item1 d="100"/>
                <item1 d="200"/>
            </data1>
            <data2>
                <item2 d="111"/>
                <item2 d="222"/>
            </data2>
        </document>
        <document number="1234">
            <data1>
                <item1 d="300"/>
                <item1 d="400"/>
            </data1>
        </document>
        <document number="1234">
            <data1>
                <item1 d="500"/>
            </data1>
            <data2>
                <item2 d="555"/>
            </data2>
        </document>
    </root>'
    
    SELECT Q1.docid, Q1.number, Q2.d
    FROM (
        SELECT  ROW_NUMBER() OVER (ORDER BY T1.N1) AS docid,
                N1.value('@number','varchar(10)') AS number,
                N1.query('.') AS X
        FROM @x.nodes('root/document') T1(N1)
    ) Q1
    CROSS APPLY (
        SELECT N2.value('@d','varchar(10)') AS d
        FROM Q1.x.nodes('document/data1/item1') T2(N2)
    ) Q2
    
    SELECT Q1.docid, Q1.number, Q2.d
    FROM (
        SELECT  ROW_NUMBER() OVER (ORDER BY T1.N1) AS docid,
                N1.value('@number','varchar(10)') AS number,
                N1.query('.') AS X
        FROM @x.nodes('root/document') T1(N1)
    ) Q1
    CROSS APPLY (
        SELECT N2.value('@d','varchar(10)') AS d
        FROM Q1.x.nodes('document/data2/item2') T2(N2)
    ) Q2
    

    To generate docid I have used ROW_NUMBER over the node column, as explained in http://dataeducation.com/uniquely-identifying-xml-nodes-with-dense_rank/.