Search code examples
sqlsql-servert-sqlgroup-bypacket

TSQL: Query to group files by packet of a maximum size


I face a problem that could be not very difficult in a first approach but it's not as easy as it could be. I have a list of files with their sizes in Mbytes that a list from a DB table.

Example:

SELECT file_name, file_size
FROM myfiles

Result:

file_name | file_size    
----------------------
file 1    | 14
file 2    | 5
file 3    | 20
file 4    | 6

I want to implement a SQL query that group files by packet to send to an API. Files should be regrouped by packet of 20 Mb maximum. The best solution would be to return the "sending index" in a third column. Something like this:

file_name | file_size | packet_index   
-------------------------------------
file 1    | 14        | 1 
file 2    | 5         | 1
file 3    | 20        | 2
file 4    | 6         | 3

The first send will contain file 1 & 2, the second file 3 and the last one file 4. How can I determine this information in SQL please ?


Solution

  • Unfortunately, this type of problem requires a recursive CTE. You are assigning groups in order, so:

    with tt as (
          select t.*, row_number() over (order by file_name) as seqnum
          from myfiles t
         ),
         cte as (
          select file_name, file_size, file_size as total, seqnum, 1 as grp
          from tt
          where tt.seqnum = 1
          union all
          select tt.filename, tt.filesize,
                 (case when tt.filesize + cte.total > 20
                       then tt.filesize
                       else tt.filesize + cte.total 
                  end),
                 tt.seqnum,
                 (case when tt.filesize + cte.total > 20
                       then cte.grp + 1
                       else cte.grp
                  end)
          from cte join
               tt
               on tt.seqnum = cte.seqnum + 1
        )
    select *
    from cte;
    

    Here is a db<>fiddle.

    If you have more than 100 rows, then add OPTION (MAXRECURSION 0).