Search code examples
sqltrino

Window over query not retrieving what I need


We are trying to create a query to do an aggregation on the data below. This is a skinny table, with log data for operations happening in a set of equipment. The equipment is identified in the "ID" column. In this sample, there are two equipment. We need to total the column "Value" based on the label in column "Label_2". The aggregation starts from the latest "Ct-Primary" label, and ends when it finds the label 'Out Track'. So for equipment "C409EA83-A2C6-39F5-A8BC-C9E91BA7A756" it would start aggregating at Time "5/2/2023 14:30" and end at time "5/5/2023 6:00" and then again from time "5/9/2023 0:00" and ends at "5/11/2023 18:00." It would do the same for the rest of the equipment ID in the list.So the result I expect for this data sample would be:

ID   Total
C409EA83-A2C6-39F5-A8BC-C9E91BA7A756 153.00
97B398D7-9BBC-D036-1CC2-C828F0F22243 118.50

So far I have tried a few things with the over function, but that only picks by the ID and aggregates all the values in that ID. I tried adding an over by ID and one of the labels and it gives me the total value for "Ct-Primary" only but not the window I want to aggregate. Sample:

    Time    Value   Label_2 ID
5/2/2023 12:00  0.75    Rolling C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/2/2023 12:45  1.75    Ct-Primary  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/2/2023 14:30  9.50    Ct-Primary  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/2/2023 15:30  1.00    NU/ND WH/XT C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/2/2023 16:30  4.00    NU/ND   C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/2/2023 20:30  0.50    NU/ND   C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/2/2023 21:00  3.00    NU/ND   C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/3/2023 0:00   3.00    Casing  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/3/2023 3:00   3.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/3/2023 6:00   14.00   EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/3/2023 20:00  4.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/4/2023 0:00   7.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/4/2023 7:00   5.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/4/2023 12:00  8.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/4/2023 20:00  4.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/5/2023 0:00   6.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/5/2023 6:00   7.00    Out Track   C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/5/2023 10:00  0.50    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/5/2023 10:30  13.50   EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/6/2023 0:00   7.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/6/2023 7:00   0.50    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/6/2023 7:30   16.50   EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/7/2023 0:00   7.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/7/2023 7:00   0.50    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/7/2023 7:30   12.50   EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/7/2023 20:00  4.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/8/2023 0:00   7.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/8/2023 7:00   0.50    Safety Mtg  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/8/2023 7:30   13.50   EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/8/2023 21:00  3.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/9/2023 0:00   5.00    Ct-Primary  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/9/2023 6:00   2.25    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/9/2023 8:15   0.25    Safety Mtg  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/9/2023 8:30   10.50   EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/9/2023 19:00  5.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/10/2023 0:00  6.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/10/2023 6:00  6.50    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/10/2023 12:30 0.25    Safety Mtg  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/10/2023 12:45 0.25    Safety Mtg  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/10/2023 13:00 2.50    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/10/2023 15:30 2.50    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/10/2023 18:00 1.00    Safety Mtg  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/10/2023 19:00 5.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/11/2023 0:00  6.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/11/2023 6:00  0.50    Safety Mtg  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/11/2023 6:30  11.50   EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/11/2023 18:00 9.00    Out Track   C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/12/2023 0:00  0.50    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/12/2023 0:30  3.50    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/12/2023 4:00  0.75    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/12/2023 4:45  1.25    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/12/2023 6:00  0.50    Safety Mtg  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/12/2023 6:30  5.75    Rig Mob C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/12/2023 12:15 0.25    Safety Mtg  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/12/2023 12:30 6.00    EQ Mob  C409EA83-A2C6-39F5-A8BC-C9E91BA7A756
5/2/2023 12:00  1.00    EQ Mob  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/2/2023 12:45  0.50    Safety Mtg  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/2/2023 14:30  4.00    EQ Mob  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/2/2023 15:30  2.50    EQ Mob  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/2/2023 16:30  2.75    EQ Mob  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/2/2023 20:30  0.75    EQ Mob  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/2/2023 21:00  12.00   Ct-Primary  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/3/2023 0:00   2.00    EQ Mob  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/3/2023 3:00   1.50    EQ Mob  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/3/2023 6:00   2.50    EQ Mob  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/3/2023 20:00  6.00    EQ Mob  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/4/2023 0:00   12.00   EQ Mob  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/4/2023 7:00   6.00    EQ Mob  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/4/2023 12:00  6.00    EQ Mob  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/4/2023 20:00  2.50    EQ Mob  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/5/2023 0:00   9.50    NU/ND   97B398D7-9BBC-D036-1CC2-C828F0F22243
5/5/2023 6:00   3.00    NU/ND   97B398D7-9BBC-D036-1CC2-C828F0F22243
5/5/2023 10:00  0.50    BP Test 97B398D7-9BBC-D036-1CC2-C828F0F22243
5/5/2023 10:30  2.50    BP Test 97B398D7-9BBC-D036-1CC2-C828F0F22243
5/6/2023 0:00   3.00    BP Test 97B398D7-9BBC-D036-1CC2-C828F0F22243
5/6/2023 7:00   3.00    BP Test 97B398D7-9BBC-D036-1CC2-C828F0F22243
5/6/2023 7:30   6.50    BP Test 97B398D7-9BBC-D036-1CC2-C828F0F22243
5/7/2023 0:00   2.50    RU/RD   97B398D7-9BBC-D036-1CC2-C828F0F22243
5/7/2023 7:00   3.00    EQ Repair   97B398D7-9BBC-D036-1CC2-C828F0F22243
5/7/2023 7:30   6.00    EQ Repair   97B398D7-9BBC-D036-1CC2-C828F0F22243
5/7/2023 20:00  2.00    EQ Repair   97B398D7-9BBC-D036-1CC2-C828F0F22243
5/8/2023 0:00   4.00    EQ Repair   97B398D7-9BBC-D036-1CC2-C828F0F22243
5/8/2023 7:00   14.50   EQ Repair   97B398D7-9BBC-D036-1CC2-C828F0F22243
5/8/2023 7:30   3.50    PU/LD BA    97B398D7-9BBC-D036-1CC2-C828F0F22243
5/8/2023 21:00  3.50    PU/LD Pipe  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/9/2023 0:00   1.00    Out Track   97B398D7-9BBC-D036-1CC2-C828F0F22243
5/9/2023 6:00   1.00    PU/LD Pipe  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/9/2023 8:15   0.50    Test Csg    97B398D7-9BBC-D036-1CC2-C828F0F22243
5/9/2023 8:30   7.00    PU/LD Pipe  97B398D7-9BBC-D036-1CC2-C828F0F22243
5/9/2023 19:00  1.50    RU/RD   97B398D7-9BBC-D036-1CC2-C828F0F22243

Solution

  • Syntax is in SQL Server as I dont have trino to test.

    I have broken down it into two CTEs.

    • label CTE ranks the valid labels ct-primary,out track for each id.

    label CTE outputs

    select id,time,label_2,
        row_number() over (partition by id order by time) as rn
        from log   
        where label_2 in ('ct-primary', 'out track')
    
    id time label_2 rank
    97B398D7-9BBC-D036-1CC2-C828F0F22243 2023-05-02 21:00:00.000 Ct-Primary 1
    97B398D7-9BBC-D036-1CC2-C828F0F22243 2023-05-09 00:00:00.000 Out Track 2
    C409EA83-A2C6-39F5-A8BC-C9E91BA7A756 2023-05-02 12:45:00.000 Ct-Primary 1
    C409EA83-A2C6-39F5-A8BC-C9E91BA7A756 2023-05-02 14:30:00.000 Ct-Primary 2
    C409EA83-A2C6-39F5-A8BC-C9E91BA7A756 2023-05-05 06:00:00.000 Out Track 3
    C409EA83-A2C6-39F5-A8BC-C9E91BA7A756 2023-05-09 00:00:00.000 Ct-Primary 4
    C409EA83-A2C6-39F5-A8BC-C9E91BA7A756 2023-05-11 18:00:00.000 Out Track 5
    • Next is valid_pairs CTE which SELF JOINS where the label starts with latest ct-primary and ends with out track and there would be a difference of 1 in the rank between the consecutive records, so it would return something like this
    id start_time end_time
    97B398D7-9BBC-D036-1CC2-C828F0F22243 2023-05-02 21:00:00.000 2023-05-09 00:00:00.000
    C409EA83-A2C6-39F5-A8BC-C9E91BA7A756 2023-05-02 14:30:00.000 2023-05-05 06:00:00.000
    C409EA83-A2C6-39F5-A8BC-C9E91BA7A756 2023-05-09 00:00:00.000 2023-05-11 18:00:00.000

    Finally the valid_pairs CTE is joined with the log table based on id and the time between valid pairs's start_time and end_time and calculate the aggregated value grouped by id

    select  l.id,sum(l.value) as total
    from log l
    inner join valid_pairs vp
    on l.id = vp.id and l.time between vp.start_time and vp.end_time
    group by l.id
    

    Final Query :

    Demo Fiddle

    with labels as (
        select id,time,label_2,
        row_number() over (partition by id order by time) as rank
        from log   
        where label_2 in ('ct-primary', 'out track')
    ),
    valid_pairs as
    (
        select  l1.id, l1.time as start_time, l2.time as end_time
        from labels l1
        inner join  labels l2 on l1.id = l2.id and l2.rank = l1.rank + 1
        where l1.label_2 = 'ct-primary' and l2.label_2 = 'out track'
    )
    select  l.id,sum(l.value) as total
    from log l
    inner join valid_pairs vp
    on l.id = vp.id and l.time between vp.start_time and vp.end_time
    group by l.id
    order by  l.id;
    

    Output

    id total
    97B398D7-9BBC-D036-1CC2-C828F0F22243 118.5
    C409EA83-A2C6-39F5-A8BC-C9E91BA7A756 153