Search code examples
sqlsql-server

How to generate sequence number for a set of rows while Insert?


The original question was - how to attach a sequence number to a set of rows which are extracted from an xml string.

Updated script below. Adopted from the solution provided by user @T N .

I just need to know whether this script is correct or am i doing anything wrong.

Note:- this seems to produce the correct result though!

Update:06/25/24 - I am still looking for an answer.

CREATE TABLE [dbo].[UseSequence](
    [UId] [bigint] NOT NULL,
    [Version] [int] NOT NULL,
    [Speed] [int] NOT NULL,
    [TrainType] [int] NOT NULL,
    [HeadEnd] [int] NOT NULL,
    [Restricted] [int] NOT NULL,
    [Sequence] [int] NOT NULL
);

DECLARE 
    @BulletindataText nvarchar(max)
    ,@BulletinDataXml XML
    ,@BulletinSpeedsXml xml
    ,@DigitalUID    bigint
    ,@Version int;
;

Set @BulletindataText = '<BulletinData>
    <Bul>       
        <BulletinSpeeds>
            <BulletinSpeedRestriction>
                <Speed >30</Speed>
                <TrainType Numeric="2">Passenger</TrainType>
                <HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
                <RestrictedSpeed Numeric="2">No</RestrictedSpeed>
            </BulletinSpeedRestriction>
            <BulletinSpeedRestriction>
                <Speed >30</Speed>
                <TrainType Numeric="1">Freight</TrainType>
                <HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
                <RestrictedSpeed Numeric="2">No</RestrictedSpeed>
            </BulletinSpeedRestriction>
            <BulletinSpeedRestriction>
                <Speed >30</Speed>
                <TrainType Numeric="3">Intermodal</TrainType>
                <HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
                <RestrictedSpeed Numeric="2">No</RestrictedSpeed>
            </BulletinSpeedRestriction>
            <BulletinSpeedRestriction>
                <Speed >30</Speed>
                <TrainType Numeric="6">Commuter</TrainType>
                <HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
                <RestrictedSpeed Numeric="2">No</RestrictedSpeed>
            </BulletinSpeedRestriction>
            <BulletinSpeedRestriction>
                <Speed >30</Speed>
                <TrainType Numeric="5">TiltTrain</TrainType>
                <HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
                <RestrictedSpeed Numeric="2">No</RestrictedSpeed>
            </BulletinSpeedRestriction>
            <BulletinSpeedRestriction>
                <Speed >30</Speed>
                <TrainType Numeric="4">HighSpeedPassenger</TrainType>
                <HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
                <RestrictedSpeed Numeric="2">No</RestrictedSpeed>
            </BulletinSpeedRestriction>
        </BulletinSpeeds>
    </Bul>
</BulletinData>'
;

SET @BulletinDataXml = cast(@BulletinDataText as xml)
SET @BulletinSpeedsXml =  @BulletinDataXml.query('<BulletinSpeeds> {for $x in /BulletinData/Bul/BulletinSpeeds/child::* return $x} </BulletinSpeeds>'); 


-- Insert 1
Set @DigitalUID = 104;
Set @Version = 1;

-- Insert 2
--Set @DigitalUID = 104;
--Set @Version = 2;

INSERT INTO [dbo].[BulletinSpeedRestrictionUseSequence]
(DigitalUID
,[Version]
,Speed
,TrainType
,HeadEndOnlySpeedRestriction
,RestrictedSpeed,
[Sequence]
)
SELECT  @DigitalUID,
        @Version,
        a.c.value('Speed[1]','int') as 'Speed',
        a.c.value('TrainType[1]/@Numeric','int') as 'TrainType',
        a.c.value('HeadEndOnlySpeedRestriction[1]/@Numeric','int') as 'HeadEndOnlySpeedRestriction',
        a.c.value('RestrictedSpeed[1]/@Numeric','int') as 'RestrictedSpeed',
        row_number() over (partition by @DigitalUID, @Version order by a.c) AS SequenceNum
FROM @BulletinSpeedsXml.nodes('/BulletinSpeeds/BulletinSpeedRestriction') a(c)
;

Result

DigitalUID Version Speed TrainType HeadEndOnlySpeedRestriction RestrictedSpeed Sequence
104 1 30 2 2 2 1
104 1 30 1 2 2 2
104 1 30 3 2 2 3
104 1 30 6 2 2 4
104 1 30 5 2 2 5
104 1 30 4 2 2 6
104 2 30 2 2 2 1
104 2 30 1 2 2 2
104 2 30 3 2 2 3
104 2 30 6 2 2 4
104 2 30 5 2 2 5
104 2 30 4 2 2 6

Solution

  • Regarding the sequence number, if you need to establish a well-defined order withing each set of values (having the same UID and Version), I believe an explicit SequenceNum or Ordinal column is the best approach. I would avoid using an IDENTITY column that doubles as both a PK and a row order, as this could cause difficulties in the future if you need to modify the data in some way that adjusts that defined order. I think it is best to keep PK and ordinal separate.

    Based on the mixed order in your posted results, I am going to assume that this reflects the order of your source data XML and that you wish to guarantee that the original order is reflected in the assigned SequenceNum value.

    I also reverse engineered some source data based on your query structure and an assumption that the @UID, @Version, and @BSpeedsXml values actually originate from a table.

    CREATE TABLE Data (
        UID INT,
        Version INT,
        BSpeedsXml XML
    )
    
    INSERT Data
    VALUES
        (115967, 1, '<BSpeed>
          <Restriction><Speed>60</Speed><TrainType Numeric="2" /></Restriction>
          <Restriction><Speed>40</Speed><TrainType Numeric="1" /></Restriction>
          <Restriction><Speed>40</Speed><TrainType Numeric="3" /></Restriction>
          <Restriction><Speed>60</Speed><TrainType Numeric="6" /></Restriction>
          <Restriction><Speed>60</Speed><TrainType Numeric="5" /></Restriction>
          <Restriction><Speed>60</Speed><TrainType Numeric="4" /></Restriction>
        </BSpeed>'),
        (115967, 2, '<BSpeed>
          <Restriction><Speed>60</Speed><TrainType Numeric="1" /></Restriction>
          <Restriction><Speed>40</Speed><TrainType Numeric="2" /></Restriction>
          <Restriction><Speed>40</Speed><TrainType Numeric="4" /></Restriction>
          <Restriction><Speed>60</Speed><TrainType Numeric="5" /></Restriction>
          <Restriction><Speed>60</Speed><TrainType Numeric="6" /></Restriction>
          <Restriction><Speed>60</Speed><TrainType Numeric="3" /></Restriction>
        </BSpeed>')
    

    Based on some of the answers to this question, I have a couple of candidate solutions:

    -- Calculate node position by counting preceding sibling nodes.
    -- This may become excessively slow as the node count increases.
    SELECT  d.UID,
            d.Version,
            a.c.value('Speed[1]','int') as 'Speed',
            a.c.value('TrainType[1]/@Numeric','int') as 'TrainType',
            a.c.value('for $i in . return count(../*[. << $i]) + 1', 'int') AS SequenceNum
    FROM Data d
    CROSS APPLY d.BSpeedsXml.nodes('/BSpeed/Restriction') a(c)
    ORDER BY UID, Version, SequenceNum
    
    -- Generate position numbers using ROW_NUMBER() OVER(ORDER BY xml_node).
    -- (This is potentially the best answer, but I cannot find documentation
    -- stating that "ORDER BY xml_node" is well-defined.)
    SELECT  d.UID,
            d.Version,
            a.c.value('Speed[1]','int') as 'Speed',
            a.c.value('TrainType[1]/@Numeric','int') as 'TrainType',
            row_number() over (partition by UID, Version order by a.c) AS SequenceNum
    FROM Data d
    CROSS APPLY d.BSpeedsXml.nodes('/BSpeed/Restriction') a(c)
    ORDER BY UID, Version, SequenceNum
    
    -- Count the nodes, generate sequence numbers, and then access the data using those
    -- sequence numbers as indexes.
    -- Note: GENERATE_SERIES() function is only available in SQL Server 2022 and later.
    SELECT  d.UID,
            d.Version,
            a.c.value('Speed[1]','int') as 'Speed',
            a.c.value('TrainType[1]/@Numeric','int') as 'TrainType',
            s.value AS SequenceNum
    FROM Data d
    CROSS APPLY GENERATE_SERIES(1, d.BSpeedsXml.value('count(/BSpeed/Restriction)', 'int')) s
    CROSS APPLY d.BSpeedsXml.nodes('/BSpeed/Restriction[sql:column("s.value")]') a(c)
    ORDER BY UID, Version, SequenceNum
    
    -- Alternative to GENERATE_SERIES() for SQL Server versions prior to 2022.
    -- Count the nodes, generate sequence numbers, and then access the data using those
    -- sequence numbers as indexes.
    SELECT  d.UID,
            d.Version,
            a.c.value('Speed[1]','int') as 'Speed',
            a.c.value('TrainType[1]/@Numeric','int') as 'TrainType',
            s.value AS SequenceNum
    FROM Data d
    CROSS APPLY ( -- Alternative to GENERATE_SERIES() for older SQL Server versions
        SELECT TOP (d.BSpeedsXml.value('count(/BSpeed/Restriction)', 'int'))
            ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS value
        FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) n1(value)
        CROSS JOIN (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) n2(value) -- up to 100
        CROSS JOIN (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) n3(value) -- up to 1000
        CROSS JOIN (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) n4(value) -- up to 10,000
        -- Add more CROSS JOINs if needed based on expected max count.
    ) s
    CROSS APPLY d.BSpeedsXml.nodes('/BSpeed/Restriction[sql:column("s.value")]') a(c)
    ORDER BY UID, Version, SequenceNum
    

    All produce the same results:

    UID Version Speed TrainType SequenceNum
    115967 1 60 2 1
    115967 1 40 1 2
    115967 1 40 3 3
    115967 1 60 6 4
    115967 1 60 5 5
    115967 1 60 4 6
    115967 2 60 1 1
    115967 2 40 2 2
    115967 2 40 4 3
    115967 2 60 5 4
    115967 2 60 6 5
    115967 2 60 3 6

    See this db<>fiddle for a demo. (Alternate demo at SQL Fiddle.)