Search code examples
sqlsql-servert-sqlwindow-functionsrank

T-SQL - Partition a running total


I've written a query that returns the size of my individual records in mb. These records contain Blob data.

I would like to partition the records in 50mb batches.

SELECT SourceId, Title, Description,
           SUM(DATALENGTH(VersionData) * 0.000001) OVER (PARTITION BY DATALENGTH(SourceId) ORDER BY SourceId) AS RunningTotal,
           RANK() OVER(ORDER BY SourceId) AS RowNo
FROM TargetContentVersion WITH(NOLOCK)

The data returned from this query currently looks like this, where RunningTotal is the running total in mb of the records:

SourceId            Title                                           RunningTotalRowNo
00Pf4000006gna3EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_37_32).pdf    5.242880    1
00Pf4000006gna8EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_37_38).doc    6.291456    2
00Pf4000006gnacEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_38_44).pdf    7.340032    3
00Pf4000006gnaDEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_41).doc    12.582912   4
00Pf4000006gnahEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_38_47).pdf    17.825792   5
00Pf4000006gnaIEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_46).doc    23.068672   6
00Pf4000006gnamEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_38_54).pdf    33.554432   7
00Pf4000006gnaNEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_52).txt    34.603008   8
00Pf4000006gnarEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_20).doc    35.651584   9
00Pf4000006gnaSEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_55).txt    40.894464   10
00Pf4000006gnawEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_24).doc    46.137344   11
00Pf4000006gnaXEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_38_0).txt 51.380224   12
00Pf4000006gnb1EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_30).doc    61.865984   13
00Pf4000006gnb6EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_50).txt    62.914560   14
00Pf4000006gnbaEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_29).doc    68.157440   15
00Pf4000006gnbBEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_39_58).txt    78.643200   16
00Pf4000006gnbfEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_34).doc    89.128960   17
00Pf4000006gnbGEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_7).pdf 90.177536   18
00Pf4000006gnbkEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_43).txt    91.226112   19
00Pf4000006gnbLEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_12).pdf    96.468992   20
00Pf4000006gnbpEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_46).txt    101.711872  21
00Pf4000006gnbQEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_17).pdf    112.197632  22
00Pf4000006gnbuEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_52).txt    122.683392  23
00Pf4000006gnbVEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_26).doc    123.731968  24
00Pf4000006gnbzEAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_0).pdf 124.780544  25
00Pf4000006gnc4EAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_5).pdf 130.023424  26
00Pf4000006gnc9EAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_11).pdf    140.509184  27
00Pf4000006gncdEAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_56).txt    145.752064  28
00Pf4000006gncEEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_30).doc    146.800640  29
00Pf4000006gnciEAA  001f400000ZP5yWAAT_3 Oct 2018 (14_42_3).txt 157.286400  30
00Pf4000006gncJEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_33).doc    162.529280  31
00Pf4000006gncKEAQ  001f400000ZP5ycAAD_3 Oct 2018 (14_48_11).txt    173.015040  32
00Pf4000006gncnEAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_12).pdf    174.063616  33
00Pf4000006gncsEAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_15).pdf    179.306496  34
00Pf4000006gncTEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_44).doc    189.792256  35
00Pf4000006gncxEAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_30).pdf    200.278016  36
00Pf4000006gncYEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_53).txt    201.326592  37
00Pf4000006gnd2EAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_46).doc    202.375168  38
00Pf4000006gnd7EAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_49).doc    207.618048  39
00Pf4000006gndbEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_23).pdf    212.860928  40
00Pf4000006gndCEAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_42_54).doc    223.346688  41
00Pf4000006gndgEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_30).pdf    233.832448  42
00Pf4000006gnDhEAI  Snake_River_(5mb).jpg   239.077777  43
00Pf4000006gndHEAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_43_3).txt 240.126353  44
00Pf4000006gndlEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_39).doc    241.174929  45
00Pf4000006gndMEAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_43_6).txt 246.417809  46
00Pf4000006gndqEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_41).doc    251.660689  47
00Pf4000006gnDrEAI  Pizigani_1367_Chart_10MB.jpg    261.835395  48
00Pf4000006gndREAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_43_11).txt    272.321155  49
00Pf4000006gndvEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_47).doc    282.806915  50
00Pf4000006gnDwEAI  Spinner_Dolphin_Indian_Ocean_07-2017.jpg    284.109019  51
00Pf4000006gndWEAQ  001f400000ZP5yYAAT_3 Oct 2018 (14_43_20).pdf    285.157595  52
00Pf4000006gnDXEAY  440 Kb.jpg  285.609143  53
00Pf4000006gne0EAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_59).txt    286.657719  54
00Pf4000006gne5EAA  001f400000ZP5yYAAT_3 Oct 2018 (14_44_2).txt 291.900599  55
00Pf4000006gneaEAA  001f400000ZP5yZAAT_3 Oct 2018 (14_44_59).txt    302.386359  56
00Pf4000006gneAEAQ  001f400000ZP5yYAAT_3 Oct 2018 (14_44_7).txt 312.872119  57
00Pf4000006gneeEAA  001f400000ZP5yZAAT_3 Oct 2018 (14_44_40).doc    323.357879  58

I would like the results to look like this where they are partitioned in 50mb batches:

  SourceId          Title                                           RunningTotalRowNo Batch
00Pf4000006gna3EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_37_32).pdf    5.242880    1     1
00Pf4000006gna8EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_37_38).doc    6.291456    2     1
00Pf4000006gnacEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_38_44).pdf    7.340032    3     1
00Pf4000006gnaDEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_41).doc    12.582912   4     1
00Pf4000006gnahEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_38_47).pdf    17.825792   5     1
00Pf4000006gnaIEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_46).doc    23.068672   6     1
00Pf4000006gnamEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_38_54).pdf    33.554432   7     1
00Pf4000006gnaNEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_52).txt    34.603008   8     1
00Pf4000006gnarEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_20).doc    35.651584   9     1
00Pf4000006gnaSEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_37_55).txt    40.894464   10    1
00Pf4000006gnawEAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_24).doc    46.137344   11    1
00Pf4000006gnaXEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_38_0).txt     51.380224   12    1
00Pf4000006gnb1EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_30).doc    61.865984   13    2
00Pf4000006gnb6EAA  001f400000ZP5yUAAT_3 Oct 2018 (14_39_50).txt    62.914560   14    2
00Pf4000006gnbaEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_29).doc    68.157440   15    2
00Pf4000006gnbBEAQ  001f400000ZP5yUAAT_3 Oct 2018 (14_39_58).txt    78.643200   16    2
00Pf4000006gnbfEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_34).doc    89.128960   17    2
00Pf4000006gnbGEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_7).pdf     90.177536   18    2
00Pf4000006gnbkEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_43).txt    91.226112   19    2
00Pf4000006gnbLEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_12).pdf    96.468992   20    2
00Pf4000006gnbpEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_46).txt    101.711872  21    3
00Pf4000006gnbQEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_17).pdf    112.197632  22    3
00Pf4000006gnbuEAA  001f400000ZP5yVAAT_3 Oct 2018 (14_40_52).txt    122.683392  23    3
00Pf4000006gnbVEAQ  001f400000ZP5yVAAT_3 Oct 2018 (14_40_26).doc    123.731968  24    3
00Pf4000006gnbzEAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_0).pdf     124.780544  25    3
00Pf4000006gnc4EAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_5).pdf    130.023424   26    3
00Pf4000006gnc9EAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_11).pdf    140.509184  27    3
00Pf4000006gncdEAA  001f400000ZP5yWAAT_3 Oct 2018 (14_41_56).txt    145.752064  28    3
00Pf4000006gncEEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_30).doc    146.800640  29    3
00Pf4000006gnciEAA  001f400000ZP5yWAAT_3 Oct 2018 (14_42_3).txt    157.286400   30    4
00Pf4000006gncJEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_33).doc    162.529280  31    4
00Pf4000006gncKEAQ  001f400000ZP5ycAAD_3 Oct 2018 (14_48_11).txt    173.015040  32    4
00Pf4000006gncnEAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_12).pdf    174.063616  33    4
00Pf4000006gncsEAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_15).pdf    179.306496  34    4
00Pf4000006gncTEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_44).doc    189.792256  35    4
00Pf4000006gncxEAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_30).pdf    200.278016  36    5
00Pf4000006gncYEAQ  001f400000ZP5yWAAT_3 Oct 2018 (14_41_53).txt    201.326592  37    5
00Pf4000006gnd2EAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_46).doc    202.375168  38    5
00Pf4000006gnd7EAA  001f400000ZP5yXAAT_3 Oct 2018 (14_42_49).doc    207.618048  39    5
00Pf4000006gndbEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_23).pdf    212.860928  40    5
00Pf4000006gndCEAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_42_54).doc    223.346688  41    5
00Pf4000006gndgEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_30).pdf    233.832448  42    5
00Pf4000006gnDhEAI  Snake_River_(5mb).jpg                          239.077777   43    5
00Pf4000006gndHEAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_43_3).txt    240.126353   44    5
00Pf4000006gndlEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_39).doc    241.174929  45    5
00Pf4000006gndMEAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_43_6).txt   246.417809    46    5
00Pf4000006gndqEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_41).doc    251.660689  47    6
00Pf4000006gnDrEAI  Pizigani_1367_Chart_10MB.jpg                   261.835395   48    6
00Pf4000006gndREAQ  001f400000ZP5yXAAT_3 Oct 2018 (14_43_11).txt    272.321155  49    6
00Pf4000006gndvEAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_47).doc    282.806915  50    6
00Pf4000006gnDwEAI  Spinner_Dolphin_Indian_Ocean_07-2017.jpg       284.109019   51    6
00Pf4000006gndWEAQ  001f400000ZP5yYAAT_3 Oct 2018 (14_43_20).pdf    285.157595  52    6
00Pf4000006gnDXEAY  440 Kb.jpg  285.609143  53
00Pf4000006gne0EAA  001f400000ZP5yYAAT_3 Oct 2018 (14_43_59).txt    286.657719  54    6
00Pf4000006gne5EAA  001f400000ZP5yYAAT_3 Oct 2018 (14_44_2).txt    291.900599   55    6
00Pf4000006gneaEAA  001f400000ZP5yZAAT_3 Oct 2018 (14_44_59).txt    302.386359  56    7
00Pf4000006gneAEAQ  001f400000ZP5yYAAT_3 Oct 2018 (14_44_7).txt    312.872119   57    7
00Pf4000006gneeEAA  001f400000ZP5yZAAT_3 Oct 2018 (14_44_40).doc   323.357879   58    7

Help would be much appreciated, thank you.


Solution

  • You can use integer division:

    SELECT ( CAST ( SUM(Datalength(versiondata) * 0.000001) 
                     OVER ( 
                       partition BY Datalength(sourceid) 
                       ORDER BY sourceid) AS INT) / 50 ) + 1 AS Batch 
    FROM   TargetContentVersion 
    

    Here's a quick sample that demonstrates how it works:

    CREATE TABLE #t (id INT IDENTITY(1,1), size NUMERIC(8,6))
    GO
    
    INSERT INTO #t 
    SELECT RAND() * 20
    GO 20 -- Create 20 sample rows with random sizes between 0 and 20
    
    SELECT id, SUM(size) OVER (ORDER BY id) AS RunningTotal, 
        (CAST(SUM(size) OVER (ORDER BY id) AS INT) / 50) + 1 AS Batch
    FROM #t
    
    id  RunningTotal    Batch
    1   2.303367        1
    2   4.049776        1
    3   19.177784       1
    4   28.637981       1
    5   29.675840       1
    6   32.781603       1
    7   33.859586       1
    8   36.633733       1
    9   39.413363       1
    10  58.004502       2
    11  70.363837       2
    12  82.897268       2
    13  83.946657       2
    14  85.623044       2
    15  87.432670       2
    16  103.304830      3
    17  103.709745      3
    18  122.165664      3
    19  126.554616      3
    20  128.019929      3