Search code examples
sql-serversubtotal

SQL Server conditional subtotal query


given the following table:

    create table #T
    (
    user_id int,
    project_id int,
    datum datetime,
    status varchar(10),
    KM int
    )

insert into #T values 
    (1, 1, '20160301 10:25', 'START', 1000), 
    (1, 1, '20160301 10:28', 'PASS', 1008), 
    (2, 2, '20160301 10:29', 'START', 2000),
    (1, 1, '20160301 11:08', 'STOP', 1045), 
    (3, 3, '20160301 10:25', 'START', 3000),
    (2, 2, '20160301 10:56', 'STOP', 2020), 
    (1, 4, '20160301 15:00', 'START', 1045), 
    (4, 5, '20160301 15:10', 'START', 400), 
    (1, 4, '20160301 15:10', 'PASS', 1060), 
    (1, 4, '20160301 15:20', 'PASS', 1080), 
    (1, 4, '20160301 15:30', 'STOP', 1080), 
    (4, 5, '20160301 15:40', 'STOP', 450),
    (3, 3, '20160301 16:25', 'STOP', 3200)

I have to sum the length of a track between START and STOP statuses for a given user and project The expected result would be this:

    user_id     project_id  datum      TOTAL_KM
----------- ----------- ---------- -----------
1           1           2016-03-01 45
1           4           2016-03-01 35
2           2           2016-03-01 20
3           3           2016-03-01 200
4           5           2016-03-01 50

How can I achieve this without using a cluster? The performance is an issue (I have over 1 million records per month and we have to keep data for several years)

Explanation: We can ignore the records with the status "PASS". Basically we have to subtract the KM value of the START record from the STOP record for a given user and project. There can be several hundred records between a START and STOP (like describes in the sample data) The date should be the date of START (in case where we have an over midnight delivery)

I think I should have a SELECT with an OVER() clause but I don't know how to formulate my query to respect those conditions.

Any idea?


Solution

  • SELECT  t.[user_id],
            t.project_id,
            cast(t.datum as date) as datum,
            t1.KM- t.KM as KM
    FROM #T t
    INNER JOIN #T t1 
        ON t.[user_id]=t1.[user_id] and t.project_id = t1.project_id 
    WHERE t.[status] = 'START' and t1.[status] = 'STOP'
    ORDER BY t.[user_id],
            t.project_id,
            cast(t.datum as date)
    

    Output:

    user_id     project_id  datum      KM
    ----------- ----------- ---------- -----------
    1           1           2016-03-01 45
    1           4           2016-03-01 35
    2           2           2016-03-01 20
    3           3           2016-03-01 200
    4           5           2016-03-01 50
    
    (5 row(s) affected)