I have the following data in a BigQuery table (call logs) :
number | called_at | direction | staff |
---|---|---|---|
1111 | 2023-01-02 13:00 | IN | John |
1111 | 2023-01-03 12:00 | IN | John |
1111 | 2023-01-04 18:00 | OUT | John |
2222 | 2023-01-05 12:00 | IN | Mark |
3333 | 2023-01-04 08:00 | IN | John |
3333 | 2023-01-04 17:00 | IN | John |
2222 | 2023-01-07 09:00 | OUT | Mark |
3333 | 2023-01-07 15:00 | OUT | John |
1111 | 2023-01-09 11:00 | IN | John |
1111 | 2023-01-13 18:00 | OUT | John |
How can I get the average time it takes for a given staff to call back numbers.
Or in other words, find using SQL the time difference between INs and OUTs (if any) for a given number and do the average for each staff.
EDIT: I'm excepting this :
staff | avg_hour |
---|---|
John | 78.3 |
Mark | 45 |
The idea is to first calculate the return time for each call, and then calculate tha average for each staff. You can achieve this with CTE or with a nested query based on the following code, which I'll divide into two queries for clarity, but there's no need to actually create a table.
This query will create a table with only the IN calls, and the first time they have been called back. Since you stated that you're counting time between, and not days, I'm assuming you can't have two in calls at the same time, because if you can, duplicates will disappear as you group the query:
CREATE TABLE callbacks AS
SELECT a.number
, a.called_at
, a.direction
, a.staff
, min(b.called_at) as callback
, min(b.called_at) - a.called_at as return_time
FROM (select * from your_table where direction = 'IN') a
INNER JOIN (select * from your_table where direction = 'OUT') b
ON a.number = b.number
AND a.staff = b.staff
AND a.called_at <= b.called_at
GROUP BY a.number, a.called_at, a.direction, a.staff
Now all you have to do is to count the average time each staff takes to return calls. Be aware that only returned calls are going to be counted, and in your example, both lines 1 and 2 will be counted as returned by line 3.
SELECT staff, avg(return_time) AS avg_return_time
FROM callbacks
GROUP BY staff
A nested query that achieves this could be:
SELECT staff, avg(return_time) AS avg_return_time
FROM (
SELECT a.number
, a.called_at
, a.direction
, a.staff
, min(b.called_at) as callback format ddmmyys.
, min(b.called_at) - a.called_at as return_time
FROM (select * from your_table where direction = 'IN') a
INNER JOIN (select * from your_table where direction = 'OUT') b
ON a.number = b.number
AND a.staff = b.staff
AND a.called_at <= b.called_at
GROUP BY a.number, a.called_at, a.direction, a.staff
)
GROUP BY staff