Search code examples
sqlgoogle-bigqueryaggregate-functionsdate-arithmetic

Find average time between records


I have the following data in a BigQuery table (call logs) :

number called_at direction staff
1111 2023-01-02 13:00 IN John
1111 2023-01-03 12:00 IN John
1111 2023-01-04 18:00 OUT John
2222 2023-01-05 12:00 IN Mark
3333 2023-01-04 08:00 IN John
3333 2023-01-04 17:00 IN John
2222 2023-01-07 09:00 OUT Mark
3333 2023-01-07 15:00 OUT John
1111 2023-01-09 11:00 IN John
1111 2023-01-13 18:00 OUT John

How can I get the average time it takes for a given staff to call back numbers.

Or in other words, find using SQL the time difference between INs and OUTs (if any) for a given number and do the average for each staff.

EDIT: I'm excepting this :

staff avg_hour
John 78.3
Mark 45

Solution

  • The idea is to first calculate the return time for each call, and then calculate tha average for each staff. You can achieve this with CTE or with a nested query based on the following code, which I'll divide into two queries for clarity, but there's no need to actually create a table.

    This query will create a table with only the IN calls, and the first time they have been called back. Since you stated that you're counting time between, and not days, I'm assuming you can't have two in calls at the same time, because if you can, duplicates will disappear as you group the query:

    CREATE TABLE callbacks AS
    SELECT a.number
        , a.called_at
        , a.direction
        , a.staff
        , min(b.called_at) as callback
        , min(b.called_at) - a.called_at as return_time
    FROM (select * from your_table where direction = 'IN') a
    INNER JOIN (select * from your_table where direction = 'OUT') b
        ON a.number = b.number
            AND a.staff = b.staff
            AND a.called_at <= b.called_at
    GROUP BY a.number, a.called_at, a.direction, a.staff
    

    Now all you have to do is to count the average time each staff takes to return calls. Be aware that only returned calls are going to be counted, and in your example, both lines 1 and 2 will be counted as returned by line 3.

    SELECT staff, avg(return_time) AS avg_return_time
    FROM callbacks
    GROUP BY staff
    

    A nested query that achieves this could be:

    SELECT staff, avg(return_time) AS avg_return_time
    FROM (
        SELECT a.number
            , a.called_at
            , a.direction
            , a.staff
            , min(b.called_at) as callback format ddmmyys.
            , min(b.called_at) - a.called_at as return_time
        FROM (select * from your_table where direction = 'IN') a
        INNER JOIN (select * from your_table where direction = 'OUT') b
            ON a.number = b.number
                AND a.staff = b.staff
                AND a.called_at <= b.called_at
        GROUP BY a.number, a.called_at, a.direction, a.staff
    )
    GROUP BY staff