Search code examples
sqlsql-serverssms-17

Minutes between dates when multiple records


Table 'audit' has fields id, old_status, new_status, & changed_at. Status contains values Open, On Hold, and Closed. I want to determine how long audit was On Hold, simple calculation:

SELECT Datediff(minute, (SELECT Min(changed_at) 
                         FROM   audit 
                         WHERE  id = 123 
                                AND new_status = 'On Hold'), 
       (SELECT Max(changed_at) 
        FROM   audit 
        WHERE  id = 123 
               AND old_status = 
                   'On Hold')) 

But, now I have an audit record that was placed On Hold multiple times. The above calculation does tell me the total time between the first time it went to On Hold and the last time it came out of On Hold, but it does not reflect that actual time it was On Hold.

Is there a query that would return that actual time?


Solution

  • For the answer below, I'm assuming that a record's initial/default status is "Open". I'm also assuming your version of SQL Server has LEAD/LAG functions.

    Let's say you have information about 3 records. All 3 records start with a status of "Open".

    • For Record 1, the status was changed 4 times:
      • It was changed from Open to On Hold at 8:00.
      • It was changed from On Hold to Open at 9:00.
      • It was changed back to On Hold at 10:00
      • It was changed to Closed at 11:00.
    • For Record 2, the status was changed twice:
      • At 8:00, it was changed from Open to On Hold.
      • At 9:00, it was changed to Closed.
    • Record 3 just has one change:
      • At 8:00, it was changed from Open to On Hold (meaning the current status is On Hold).

    Here's the data in a tabular form:

    +----+------------+------------+------------------+
    | id | old_status | new_status |    changed_at    |
    +----+------------+------------+------------------+
    |  1 | Open       | On Hold    | 2019-03-26 08:00 |
    |  1 | On Hold    | Open       | 2019-03-26 09:00 |
    |  1 | Open       | On Hold    | 2019-03-26 10:00 |
    |  1 | On Hold    | Closed     | 2019-03-26 11:00 |
    |  2 | Open       | On Hold    | 2019-03-26 08:00 |
    |  2 | On Hold    | Closed     | 2019-03-26 09:00 |
    |  3 | Open       | On Hold    | 2019-03-26 08:00 |
    +----+------------+------------+------------------+
    

    From the data, and my understanding of your problem, you want the total time a record was on hold. So, for the 3 records above:

    • Record 1 was on hold for a total of 2 hours/120 minutes: 1 hour from 8 to 9, then another hour from 10 to 11.
    • Record 2 was on hold for just 1 hour: from 8 to 9.
    • For Record 3, it's unclear what your expected results are: Would the results be from 8:00 (when it was placed on hold) to the current date/time? Or do you want to exclude this from your results?

    To begin attacking the problem, you can first use WINDOW functions to look at related results. I ended up using LAG.

    First, you can use LAG to figure out where the last change (for the record) occurred:

    SELECT 
        [id], 
        old_status,
        new_status,
        changed_at,
        prev_changed = LAG(changed_at) OVER
        (
            PARTITION BY [id]
            ORDER BY [id], changed_at
        )
    FROM audit_records
    

    This gives you the following results:

    +----+------------+------------+------------------+------------------+
    | id | old_status | new_status |    changed_at    |   prev_changed   |
    +----+------------+------------+------------------+------------------+
    |  1 | Open       | On Hold    | 2019-03-26 08:00 | NULL             |
    |  1 | On Hold    | Open       | 2019-03-26 09:00 | 2019-03-26 08:00 |
    |  1 | Open       | On Hold    | 2019-03-26 10:00 | 2019-03-26 09:00 |
    |  1 | On Hold    | Closed     | 2019-03-26 11:00 | 2019-03-26 10:00 |
    |  2 | Open       | On Hold    | 2019-03-26 08:00 | NULL             |
    |  2 | On Hold    | Closed     | 2019-03-26 09:00 | 2019-03-26 08:00 |
    |  3 | Open       | On Hold    | 2019-03-26 08:00 | NULL             |
    +----+------------+------------+------------------+------------------+
    

    Note the records with a NULL value: These are records that don't have a change before that change. So for Record 1, the change from Open to On Hold is null, since that was the first change.

    Now you can wrap this up in a CTE and calculate the number of minutes:

    WITH 
        audit_records_lead_lag([id], old_status, new_status, changed_at, prev_changed) AS
        (
            SELECT 
                [id], 
                old_status,
                new_status,
                changed_at,
                prev_changed = LAG(changed_at) OVER
                (
                    PARTITION BY [id]
                    ORDER BY [id], changed_at
                )
            FROM audit_records
        )
    SELECT 
        [id], 
        minutes_in_hold = SUM(DATEDIFF(MINUTE, prev_changed, changed_at))
    FROM audit_records_lead_lag
    WHERE
        old_status = 'On Hold'
        AND prev_changed IS NOT NULL
    GROUP BY [id]
    

    which gives you the following results:

    +----+-----------------+
    | id | minutes_in_hold |
    +----+-----------------+
    |  1 |             120 |
    |  2 |              60 |
    +----+-----------------+