The original question was - how to attach a sequence number to a set of rows which are extracted from an xml
string.
Updated script below. Adopted from the solution provided by user @T N .
I just need to know whether this script is correct or am i doing anything wrong.
Note:- this seems to produce the correct result though!
Update:06/25/24 - I am still looking for an answer.
CREATE TABLE [dbo].[UseSequence](
[UId] [bigint] NOT NULL,
[Version] [int] NOT NULL,
[Speed] [int] NOT NULL,
[TrainType] [int] NOT NULL,
[HeadEnd] [int] NOT NULL,
[Restricted] [int] NOT NULL,
[Sequence] [int] NOT NULL
);
DECLARE
@BulletindataText nvarchar(max)
,@BulletinDataXml XML
,@BulletinSpeedsXml xml
,@DigitalUID bigint
,@Version int;
;
Set @BulletindataText = '<BulletinData>
<Bul>
<BulletinSpeeds>
<BulletinSpeedRestriction>
<Speed >30</Speed>
<TrainType Numeric="2">Passenger</TrainType>
<HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
<RestrictedSpeed Numeric="2">No</RestrictedSpeed>
</BulletinSpeedRestriction>
<BulletinSpeedRestriction>
<Speed >30</Speed>
<TrainType Numeric="1">Freight</TrainType>
<HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
<RestrictedSpeed Numeric="2">No</RestrictedSpeed>
</BulletinSpeedRestriction>
<BulletinSpeedRestriction>
<Speed >30</Speed>
<TrainType Numeric="3">Intermodal</TrainType>
<HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
<RestrictedSpeed Numeric="2">No</RestrictedSpeed>
</BulletinSpeedRestriction>
<BulletinSpeedRestriction>
<Speed >30</Speed>
<TrainType Numeric="6">Commuter</TrainType>
<HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
<RestrictedSpeed Numeric="2">No</RestrictedSpeed>
</BulletinSpeedRestriction>
<BulletinSpeedRestriction>
<Speed >30</Speed>
<TrainType Numeric="5">TiltTrain</TrainType>
<HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
<RestrictedSpeed Numeric="2">No</RestrictedSpeed>
</BulletinSpeedRestriction>
<BulletinSpeedRestriction>
<Speed >30</Speed>
<TrainType Numeric="4">HighSpeedPassenger</TrainType>
<HeadEndOnlySpeedRestriction Numeric="2">No</HeadEndOnlySpeedRestriction>
<RestrictedSpeed Numeric="2">No</RestrictedSpeed>
</BulletinSpeedRestriction>
</BulletinSpeeds>
</Bul>
</BulletinData>'
;
SET @BulletinDataXml = cast(@BulletinDataText as xml)
SET @BulletinSpeedsXml = @BulletinDataXml.query('<BulletinSpeeds> {for $x in /BulletinData/Bul/BulletinSpeeds/child::* return $x} </BulletinSpeeds>');
-- Insert 1
Set @DigitalUID = 104;
Set @Version = 1;
-- Insert 2
--Set @DigitalUID = 104;
--Set @Version = 2;
INSERT INTO [dbo].[BulletinSpeedRestrictionUseSequence]
(DigitalUID
,[Version]
,Speed
,TrainType
,HeadEndOnlySpeedRestriction
,RestrictedSpeed,
[Sequence]
)
SELECT @DigitalUID,
@Version,
a.c.value('Speed[1]','int') as 'Speed',
a.c.value('TrainType[1]/@Numeric','int') as 'TrainType',
a.c.value('HeadEndOnlySpeedRestriction[1]/@Numeric','int') as 'HeadEndOnlySpeedRestriction',
a.c.value('RestrictedSpeed[1]/@Numeric','int') as 'RestrictedSpeed',
row_number() over (partition by @DigitalUID, @Version order by a.c) AS SequenceNum
FROM @BulletinSpeedsXml.nodes('/BulletinSpeeds/BulletinSpeedRestriction') a(c)
;
Result
DigitalUID | Version | Speed | TrainType | HeadEndOnlySpeedRestriction | RestrictedSpeed | Sequence |
---|---|---|---|---|---|---|
104 | 1 | 30 | 2 | 2 | 2 | 1 |
104 | 1 | 30 | 1 | 2 | 2 | 2 |
104 | 1 | 30 | 3 | 2 | 2 | 3 |
104 | 1 | 30 | 6 | 2 | 2 | 4 |
104 | 1 | 30 | 5 | 2 | 2 | 5 |
104 | 1 | 30 | 4 | 2 | 2 | 6 |
104 | 2 | 30 | 2 | 2 | 2 | 1 |
104 | 2 | 30 | 1 | 2 | 2 | 2 |
104 | 2 | 30 | 3 | 2 | 2 | 3 |
104 | 2 | 30 | 6 | 2 | 2 | 4 |
104 | 2 | 30 | 5 | 2 | 2 | 5 |
104 | 2 | 30 | 4 | 2 | 2 | 6 |
Regarding the sequence number, if you need to establish a well-defined order withing each set of values (having the same UID
and Version
), I believe an explicit SequenceNum
or Ordinal
column is the best approach. I would avoid using an IDENTITY
column that doubles as both a PK and a row order, as this could cause difficulties in the future if you need to modify the data in some way that adjusts that defined order. I think it is best to keep PK and ordinal separate.
Based on the mixed order in your posted results, I am going to assume that this reflects the order of your source data XML and that you wish to guarantee that the original order is reflected in the assigned SequenceNum
value.
I also reverse engineered some source data based on your query structure and an assumption that the @UID
, @Version
, and @BSpeedsXml
values actually originate from a table.
CREATE TABLE Data (
UID INT,
Version INT,
BSpeedsXml XML
)
INSERT Data
VALUES
(115967, 1, '<BSpeed>
<Restriction><Speed>60</Speed><TrainType Numeric="2" /></Restriction>
<Restriction><Speed>40</Speed><TrainType Numeric="1" /></Restriction>
<Restriction><Speed>40</Speed><TrainType Numeric="3" /></Restriction>
<Restriction><Speed>60</Speed><TrainType Numeric="6" /></Restriction>
<Restriction><Speed>60</Speed><TrainType Numeric="5" /></Restriction>
<Restriction><Speed>60</Speed><TrainType Numeric="4" /></Restriction>
</BSpeed>'),
(115967, 2, '<BSpeed>
<Restriction><Speed>60</Speed><TrainType Numeric="1" /></Restriction>
<Restriction><Speed>40</Speed><TrainType Numeric="2" /></Restriction>
<Restriction><Speed>40</Speed><TrainType Numeric="4" /></Restriction>
<Restriction><Speed>60</Speed><TrainType Numeric="5" /></Restriction>
<Restriction><Speed>60</Speed><TrainType Numeric="6" /></Restriction>
<Restriction><Speed>60</Speed><TrainType Numeric="3" /></Restriction>
</BSpeed>')
Based on some of the answers to this question, I have a couple of candidate solutions:
-- Calculate node position by counting preceding sibling nodes.
-- This may become excessively slow as the node count increases.
SELECT d.UID,
d.Version,
a.c.value('Speed[1]','int') as 'Speed',
a.c.value('TrainType[1]/@Numeric','int') as 'TrainType',
a.c.value('for $i in . return count(../*[. << $i]) + 1', 'int') AS SequenceNum
FROM Data d
CROSS APPLY d.BSpeedsXml.nodes('/BSpeed/Restriction') a(c)
ORDER BY UID, Version, SequenceNum
-- Generate position numbers using ROW_NUMBER() OVER(ORDER BY xml_node).
-- (This is potentially the best answer, but I cannot find documentation
-- stating that "ORDER BY xml_node" is well-defined.)
SELECT d.UID,
d.Version,
a.c.value('Speed[1]','int') as 'Speed',
a.c.value('TrainType[1]/@Numeric','int') as 'TrainType',
row_number() over (partition by UID, Version order by a.c) AS SequenceNum
FROM Data d
CROSS APPLY d.BSpeedsXml.nodes('/BSpeed/Restriction') a(c)
ORDER BY UID, Version, SequenceNum
-- Count the nodes, generate sequence numbers, and then access the data using those
-- sequence numbers as indexes.
-- Note: GENERATE_SERIES() function is only available in SQL Server 2022 and later.
SELECT d.UID,
d.Version,
a.c.value('Speed[1]','int') as 'Speed',
a.c.value('TrainType[1]/@Numeric','int') as 'TrainType',
s.value AS SequenceNum
FROM Data d
CROSS APPLY GENERATE_SERIES(1, d.BSpeedsXml.value('count(/BSpeed/Restriction)', 'int')) s
CROSS APPLY d.BSpeedsXml.nodes('/BSpeed/Restriction[sql:column("s.value")]') a(c)
ORDER BY UID, Version, SequenceNum
-- Alternative to GENERATE_SERIES() for SQL Server versions prior to 2022.
-- Count the nodes, generate sequence numbers, and then access the data using those
-- sequence numbers as indexes.
SELECT d.UID,
d.Version,
a.c.value('Speed[1]','int') as 'Speed',
a.c.value('TrainType[1]/@Numeric','int') as 'TrainType',
s.value AS SequenceNum
FROM Data d
CROSS APPLY ( -- Alternative to GENERATE_SERIES() for older SQL Server versions
SELECT TOP (d.BSpeedsXml.value('count(/BSpeed/Restriction)', 'int'))
ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS value
FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) n1(value)
CROSS JOIN (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) n2(value) -- up to 100
CROSS JOIN (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) n3(value) -- up to 1000
CROSS JOIN (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) n4(value) -- up to 10,000
-- Add more CROSS JOINs if needed based on expected max count.
) s
CROSS APPLY d.BSpeedsXml.nodes('/BSpeed/Restriction[sql:column("s.value")]') a(c)
ORDER BY UID, Version, SequenceNum
All produce the same results:
UID | Version | Speed | TrainType | SequenceNum |
---|---|---|---|---|
115967 | 1 | 60 | 2 | 1 |
115967 | 1 | 40 | 1 | 2 |
115967 | 1 | 40 | 3 | 3 |
115967 | 1 | 60 | 6 | 4 |
115967 | 1 | 60 | 5 | 5 |
115967 | 1 | 60 | 4 | 6 |
115967 | 2 | 60 | 1 | 1 |
115967 | 2 | 40 | 2 | 2 |
115967 | 2 | 40 | 4 | 3 |
115967 | 2 | 60 | 5 | 4 |
115967 | 2 | 60 | 6 | 5 |
115967 | 2 | 60 | 3 | 6 |
See this db<>fiddle for a demo. (Alternate demo at SQL Fiddle.)