Search code examples
sql-serverxmlt-sqlgroupingxquery

Group XML output based on a hardcoded number using SQL


I'm having a query which gives output in XML form.

DECLARE @XMLDATA1 XML 
SET @XMLDATA1 = (


SELECT StudentId AS 'StudentID',
StudentName AS 'StudentName',
StudentAge AS 'Age',
StudentYear 'Year'

FROM Student 
ORDER BY StudentId
FOR XML PATH('StudentLine')
)

SELECT ( CAST(@XMLDATA1 AS XML ))
FOR XML PATH ('StudentRecords') , Root('Student')

And the sample output will be as following.

<StudentRecords>
    <Student>
        <StudentLine>
            <StudentID>12</StudentID>
            <StudentName>Kevin</StudentName>
            <Age>10</Age>
            <Year>5</Year>
        </StudentLine>
        <StudentLine>
            <StudentID>13</StudentID>
            <StudentName>Peter</StudentName>
            <Age>10</Age>
            <Year>5</Year>
        </StudentLine>
        <StudentLine>
            <StudentID>14</StudentID>
            <StudentName>Martin</StudentName>
            <Age>10</Age>
            <Year>5</Year>
        </StudentLine>
        <StudentLine>
            <StudentID>15</StudentID>
            <StudentName>Justin</StudentName>
            <Age>10</Age>
            <Year>5</Year>
        </StudentLine>
    </Student>
</StudentRecords>

I want to split these records from <StudentRecords> tag based on a hardcoded number inside the sql query. As an example if I want to split to 3 records the xml will be as following.

<StudentRecords>
    <Student>
        <StudentLine>
            <StudentID>12</StudentID>
            <StudentName>Kevin</StudentName>
            <Age>10</Age>
            <Year>5</Year>
        </StudentLine>
        <StudentLine>
            <StudentID>13</StudentID>
            <StudentName>Peter</StudentName>
            <Age>10</Age>
            <Year>5</Year>
        </StudentLine>
        <StudentLine>
            <StudentID>14</StudentID>
            <StudentName>Martin</StudentName>
            <Age>10</Age>
            <Year>5</Year>
        </StudentLine>
    </Student>
</StudentRecords>
<StudentRecords>
    <Student>
        <StudentLine>
            <StudentID>15</StudentID>
            <StudentName>Justin</StudentName>
            <Age>10</Age>
            <Year>5</Year>
        </StudentLine>
    </Student>
</StudentRecords>

There can be scenarios like records with 100000 <StudentLine> tags and hardcoded number can be 2000. Can somebody help me to achieve this. Thank you.


Solution

  • Here is how to implement it via XQuery and FLWOR expression.

    • Number of rows in a group will be an input parameter.
    • We will use NTILE() function to create groups based on the number of rows in a group.
    • FLWOR expression has two loops: (1) iterates through groups, (2) iterates through the <StudentLine> elements in a given group.

    SQL

    -- DDL and sample data population, start
    DECLARE @tbl TABLE (StudentID INT PRIMARY KEY, StudentName VARCHAR(30), Age INT);
    INSERT INTO @tbl (StudentID, StudentName, Age) VALUES
    (1,  'Martha',  10),
    (2,  'Boris',  12),
    (3,  'Judy',  15),
    (12, 'Kevin',  19),
    (13, 'Peter',  10),
    (14, 'Martin', 18),
    (15, 'Justin', 25);
    -- DDL and sample data population, end
    
    -- denominator specifies # of rows in a group
    DECLARE @NumOfGroups DECIMAL(10,1) = (SELECT COUNT(*) FROM @tbl) / 3.0;
    SET @NumOfGroups = CEILING(@NumOfGroups);
    
    -- just to see
    SELECT groupId = NTILE(CAST(@NumOfGroups AS INT)) OVER (ORDER BY StudentID)
        , *
    FROM @tbl;
    
    SELECT (
        SELECT groupID = NTILE (CAST(@NumOfGroups AS INT)) OVER (ORDER BY StudentID)
            , *
        FROM @tbl
        FOR XML PATH('StudentLine'), TYPE, ROOT('root')
    ).query('
    
        for $gr in distinct-values(/root/StudentLine/groupID/text())
        return <StudentRecords>
                    <Student>
                {
                    for $x in /root/StudentLine[(groupID/text())[1] eq $gr]
                    return $x
                }
            </Student>
        </StudentRecords>
    
    ');
    

    Output

      <StudentRecords>
        <Student>
          <StudentLine>
            <groupID>1</groupID>
            <StudentID>1</StudentID>
            <StudentName>Martha</StudentName>
            <Age>10</Age>
          </StudentLine>
          <StudentLine>
            <groupID>1</groupID>
            <StudentID>2</StudentID>
            <StudentName>Boris</StudentName>
            <Age>12</Age>
          </StudentLine>
          <StudentLine>
            <groupID>1</groupID>
            <StudentID>3</StudentID>
            <StudentName>Judy</StudentName>
            <Age>15</Age>
          </StudentLine>
        </Student>
      </StudentRecords>
      <StudentRecords>
        <Student>
          <StudentLine>
            <groupID>2</groupID>
            <StudentID>12</StudentID>
            <StudentName>Kevin</StudentName>
            <Age>19</Age>
          </StudentLine>
          <StudentLine>
            <groupID>2</groupID>
            <StudentID>13</StudentID>
            <StudentName>Peter</StudentName>
            <Age>10</Age>
          </StudentLine>
        </Student>
      </StudentRecords>
      <StudentRecords>
        <Student>
          <StudentLine>
            <groupID>3</groupID>
            <StudentID>14</StudentID>
            <StudentName>Martin</StudentName>
            <Age>18</Age>
          </StudentLine>
          <StudentLine>
            <groupID>3</groupID>
            <StudentID>15</StudentID>
            <StudentName>Justin</StudentName>
            <Age>25</Age>
          </StudentLine>
        </Student>
      </StudentRecords>