Search code examples
sqlsql-serverwindow-functionsazure-sql-server

Tricky partition in SQL


I have a table like this

RID ID  DTE FLAG    AMT SUMAMT
1   1   2020-07-03 02:52:15.480 Y   10  NULL
2   1   2020-07-04 02:52:15.480 N   10  NULL
3   1   2020-07-05 02:52:15.480 N   10  NULL
4   1   2020-07-06 02:52:15.480 Y   10  NULL
5   1   2020-07-13 02:52:15.480 Y   10  NULL
6   2   2020-07-06 02:52:15.480 N   10  NULL
7   2   2020-07-13 02:52:15.480 Y   10  NULL

I need a result like this

RID ID  DTE FLAG    AMT SUMAMT
1   1   2020-07-03 02:52:15.480 Y   10  10
2   1   2020-07-04 02:52:15.480 N   10  NULL
3   1   2020-07-05 02:52:15.480 N   10  NULL
4   1   2020-07-06 02:52:15.480 Y   10  30
5   1   2020-07-13 02:52:15.480 Y   10  10
6   2   2020-07-06 02:52:15.480 N   10  NULL
7   2   2020-07-13 02:52:15.480 Y   10  20

All I need to do here is to update the SUMAMT column for 'Y' FLAG line items. Here the condition is whenever we find 'Y' FLAG, we need to check is there any 'N' flagged line items in the past based on DTE, if yes we need to take AMT of that line items and need to sum up and update the SUMAMT.

SELECT RID,ID,DTE,FLAG,AMT,SUM(AMT) OVER (PARTITION BY ID ORDER BY ID,DTE) FROM #T

I tried the above query, it is giving the running total, I don't understand how to make a partition like mentioned below...

RID ID  DTE FLAG    AMT SUMAMT
1   1   2020-07-03 02:52:15.480 Y   10  10

2   1   2020-07-04 02:52:15.480 N   10  NULL
3   1   2020-07-05 02:52:15.480 N   10  NULL
4   1   2020-07-06 02:52:15.480 Y   10  30

5   1   2020-07-13 02:52:15.480 Y   10  10

6   2   2020-07-06 02:52:15.480 N   10  NULL
7   2   2020-07-13 02:52:15.480 Y   10  20

Query to create table and insert data...

CREATE TABLE #T
(
    RID INT IDENTITY(1, 1),
    ID INT,
    DTE DATETIME,
    FLAG VARCHAR(1),
    AMT INT,
    SUMAMT INT
)

INSERT INTO #T (ID, DTE, FLAG, AMT) 
VALUES (1, GETDATE() - 10, 'Y', 10),
       (1, GETDATE() - 9, 'N', 10),
       (1, GETDATE() - 8, 'N', 10),
       (1, GETDATE() - 7, 'Y', 10),
       (1, GETDATE(), 'Y', 10),
       (2, GETDATE() - 7, 'N', 10),
       (2, GETDATE(), 'Y', 10)

Solution

  • Thanks for your sample data. it was very helpful.

    In the below query, we split the data into two parts:

    • Flag 'Y'
    • Flag 'N'

    Now, we calculate the sum of all rows of Flag 'N' which are falling between two 'Y' flags. for flag 'N' rows, there is no calculation.

    SELECT t.RID, t.id, t.dte,  t.amt,t.flag, isnull(t.amt+ot.sum_amt,amt)
    from 
    (SELECT 
    RID,ID, DTE, ISNULL(LAG(DTE,1) OVER(PARTITION BY ID ORDER BY DTE),'19000101') AS Prev_Yes
    , DTE as Current_Yes
    ,amt
    ,flag
    FROM #t as cr WHERE Flag = 'Y') as t
    OUTER APPLY
    (SELECT SUM(AMT) FROM #t
    WHERE flag = 'N' 
    AND DTE > t.Prev_Yes AND DTE < t.Current_Yes
    and ID = t.id) as ot(sum_amt) 
    UNION ALL
    SELECT RID, id, dte,  amt,flag, NULL AS SUM_AMT
    FROM #t 
    WHERE flag = 'N'
    ORDER BY rid
    
    
    +-----+----+-------------------------+-----+------+------------------+
    | RID | id |           dte           | amt | flag | (No column name) |
    +-----+----+-------------------------+-----+------+------------------+
    |   1 |  1 | 2020-07-03 09:35:10.513 |  10 | Y    | 10               |
    |   2 |  1 | 2020-07-04 09:35:10.513 |  10 | N    | NULL             |
    |   3 |  1 | 2020-07-05 09:35:10.513 |  10 | N    | NULL             |
    |   4 |  1 | 2020-07-06 09:35:10.513 |  10 | Y    | 30               |
    |   5 |  1 | 2020-07-13 09:35:10.513 |  10 | Y    | 10               |
    |   6 |  2 | 2020-07-06 09:35:10.513 |  10 | N    | NULL             |
    |   7 |  2 | 2020-07-13 09:35:10.513 |  10 | Y    | 20               |
    +-----+----+-------------------------+-----+------+------------------+