Search code examples
mysqldatehour

MySQL query to capture the working time between events


I am trying to create a query that will capture the working hours taken for an event to occur. In the data below, I'd want to show the working hours taken for an account to go from ACTIVATED to DEACTIVATED.

ENCODEDKEY  TRANSACTIONID   LOANPRODUCTKEY  TIMESTAMP              TYPE
1           2067            aa1             2015/02/06 15:29:00    LOAN_PRODUCT_ACTIVATED
2           2162            aa1             2015/02/16 14:07:00    LOAN_PRODUCT_EDITED
3           2666            aa1             2015/02/16 15:29:00    LOAN_PRODUCT_DEACTIVATED
4           3456            aa2             2015/03/06 12:01:00    LOAN_PRODUCT_ACTIVATED
5           3478            aa2             2015/03/08 13:15:00    LOAN_PRODUCT_EDITED
6           3908            aa2             2015/03/18 13:15:00    LOAN_PRODUCT_DEACTIVATED

So the results would be something like

LOANPRODUCTKEY          TIME
aa1                     24:00:00
aa2                     12:00:00

(I know those numbers are wrong!)

I also need it to only consider working hours (i.e 9am to 5pm) Is this possible?

Thank you to anyone who can help me.

Update. Big thankyou to those who've helped so far!

So I have managed to create a query which will return the correct dates for the activated and deactivated typers per loanproductkey. However, I am still struggling to work out the working hours between the two calculated dates. My query is as follows:

SELECT

att.LOANPRODUCTKEY
,sub1.time_activated
,sub2.time_deactivated

from 
activity att

left join (select
min(att.TIMESTAMP) as time_activated
,att.loanproductkey
from
activity att
where
att.`TYPE` = "LOAN_PRODUCT_ACTIVATED"
group by
att.LOANPRODUCTKEY) AS sub1
ON att.LOANPRODUCTKEY = sub1.LOANPRODUCTKEY

left join 
(select
max(att.timestamp) as time_deactivated
,att.LOANPRODUCTKEY
from
activity att
where
att.`TYPE` = "LOAN_PRODUCT_DEACTIVATED"
group by
att.LOANPRODUCTKEY) AS sub2
ON att.LOANPRODUCTKEY = sub2.LOANPRODUCTKEY

group by
att.loanproductkey

Solution

  • Ok, so I think the query below should work although it probably can be tweaked a lot. It's most likely possible to do it smarter too I guess :)

    The logic is as follows:

    1. First it uses a number table to generate the sequence of dates between the start and end dates for every loanproductkey (this is done in a derived table).

    2. Then it joins this table and counts the number of whole working days between the start and end (exclusive) and adds this (times 240 to get minutes) to the number of minutes that is the difference between the starting days time and 17:00, and the difference in minutes between 09:00 and the ending days time.

    So the calculation is this:

    (minutes from start time of first day to 17:00) -- eg. '17:00:00'-'15:29:00'
    +
    (minutes from 09:00 to end time of last day) -- eg. '15:29:00'-'09:00'
    +
    (number of working days between start and end) * 240
    

    It's possible that you might have to fine-tune the math a bit but the logic should be sound.

    Assumptions made:

    • Weekends are weekday 5 and 6 as returned by the weekday function - this might be dependent on local server settings, I'm not sure.

    • That there exists a table called numbers with a column num containing digits 1 through at least the range needed to cover your maximum date range from start to end. If you don't have this I've described how you can create it at the end.

    This gets a result like:

    LOANPRODUCTKEY  total_min   total_time (hhh:mm:ss)
    aa1             4800        80:00:00
    aa2             5834        97:14:00
    

    Sample SQL Fiddle (without the total_time column as fiddle uses java which doesn't like large values for the hour part.

    SELECT 
        t_start.LOANPRODUCTKEY
    
        , (
            TIMESTAMPDIFF(MINUTE, CAST(t_start.timestamp AS time), CAST('17:00:00' AS time))
          + TIMESTAMPDIFF(MINUTE, CAST('09:00:00' AS time), CAST(t_end.timestamp AS time))
          + COUNT(WEEKDAY(t_start.timestamp) NOT IN (5,6)) * 8 * 60
        ) AS total_minutes
    
        , SEC_TO_TIME(
            TIMESTAMPDIFF(SECOND, CAST(t_start.timestamp AS time), CAST('17:00:00' AS time)) 
          + TIMESTAMPDIFF(SECOND, CAST('09:00:00' AS time), CAST(t_end.timestamp AS time)) 
          + COUNT(WEEKDAY(t_start.timestamp) NOT IN (5,6)) * 8 * 60 * 60
        ) AS total_time
    
    FROM 
        t t_start
    JOIN 
        t t_end ON t_start.LOANPRODUCTKEY = t_end.LOANPRODUCTKEY
    JOIN 
    (
        SELECT 
            ts.LOANPRODUCTKEY
            , DATE(DATE_ADD(ts.timestamp,INTERVAL num DAY)) AS datesSeries
        FROM 
            t ts
        JOIN 
            t te ON ts.LOANPRODUCTKEY = te.LOANPRODUCTKEY
        CROSS JOIN
            numbers r
        WHERE 
            num < DATEDIFF(te.timestamp, ts.timestamp) 
         AND 
            ts.TYPE = 'LOAN_PRODUCT_ACTIVATED'
         AND 
            te.TYPE = 'LOAN_PRODUCT_DEACTIVATED'
    ) dates ON t_start.LOANPRODUCTKEY = dates.LOANPRODUCTKEY
    
    WHERE 
        t_start.TYPE = 'LOAN_PRODUCT_ACTIVATED' AND t_end.TYPE = 'LOAN_PRODUCT_DEACTIVATED'
    GROUP BY 
        t_start.LOANPRODUCTKEY, t_start.TIMESTAMP, t_end.TIMESTAMP
    ORDER BY 
        t_start.LOANPRODUCTKEY;
    

    If you don't already have a suitable table with a number sequence that covers the maximum number of days between the start and end dates you can create a table filled with the number 1-1000 using the query below, which I took from this answer.

    CREATE TABLE numbers (num int primary key);
    
    INSERT INTO numbers
    SELECT SEQ.SeqValue
    FROM (
        SELECT (HUNDREDS.SeqValue + TENS.SeqValue + ONES.SeqValue) SeqValue
        FROM (
            SELECT 0 SeqValue UNION ALL SELECT 1 SeqValue UNION ALL
            SELECT 2 SeqValue UNION ALL SELECT 3 SeqValue UNION ALL
            SELECT 4 SeqValue UNION ALL SELECT 5 SeqValue UNION ALL
            SELECT 6 SeqValue UNION ALL SELECT 7 SeqValue UNION ALL
            SELECT 8 SeqValue UNION ALL SELECT 9 SeqValue
        ) ONES
        CROSS JOIN (
            SELECT  0 SeqValue UNION ALL SELECT 10 SeqValue UNION ALL
            SELECT 20 SeqValue UNION ALL SELECT 30 SeqValue UNION ALL
            SELECT 40 SeqValue UNION ALL SELECT 50 SeqValue UNION ALL
            SELECT 60 SeqValue UNION ALL SELECT 70 SeqValue UNION ALL
            SELECT 80 SeqValue UNION ALL SELECT 90 SeqValue
        ) TENS
        CROSS JOIN (
            SELECT   0 SeqValue UNION ALL SELECT 100 SeqValue UNION ALL
            SELECT 200 SeqValue UNION ALL SELECT 300 SeqValue UNION ALL
            SELECT 400 SeqValue UNION ALL SELECT 500 SeqValue UNION ALL
            SELECT 600 SeqValue UNION ALL SELECT 700 SeqValue UNION ALL
            SELECT 800 SeqValue UNION ALL SELECT 900 SeqValue
        ) HUNDREDS
    ) SEQ WHERE SEQ.SeqValue > 0;