This is a continuation of a previous post on this site: Finding Gaps in Timestamps for Multiple Users in PostgreSQL
I am working with a dataset containing Check-In and Check-Out times for multiple office rooms over the last 5 years. One of the projects I was asked to work on was calculating the amount of time each room is busy and vacant over various time ranges (daily, weekly, monthly, etc.) assuming set operational hours (7:30am to 5pm). Deviating from my last post, there are instances of overlapping time ranges. A sample of the dataset for one day looks like this:
room_id check_in check_out
"Room D" "2014-07-18 12:23:00" "2014-07-18 12:54:00"
"Room D" "2014-07-19 09:16:00" "2014-07-19 10:30:00"
"Room D" "2014-07-19 09:10:00" "2014-07-19 10:30:00"
"Room D" "2014-07-18 08:45:00" "2014-07-18 22:40:00"
"Room 5" "2014-07-19 10:20:00" "2014-07-19 12:20:00"
"Room 5" "2014-07-18 07:59:00" "2014-07-18 09:00:00"
"Room 5" "2014-07-18 09:04:00" "2014-07-18 14:00:00"
"Room 5" "2014-07-18 07:59:00" "2014-07-18 10:00:00"
From my previous post, I was very helpfully provided with this snippet of code which works perfectly for all instances where there is no overlap as pointed out by the author:
select date_trunc('day', start_dt), room_id,
sum( least(extract(epoch from end_dt), v.epoch2) -
greatest(extract(epoch from start_dt), epoch1)
) as busy_seconds,
(epoch2 - epoch1 -
sum( least(extract(epoch from end_dt), v.epoch2) -
greatest(extract(epoch from start_dt), epoch1)
)
) as free_seconds
from rooms r cross join
(values (extract(epoch from date_trunc('day', start_dt) + interval '7 hours 30 minutes'),
extract(epoch from date_trunc('day', start_dt) + interval '17 hour')
)
) v(epoch1, epoch2)
group by date_trunc('day', start_dt), room_id
However, after digging through our data, there are more instances of overlapping time ranges than I expected. Here is the target output I would like to retrieve from the sample data above:
target_day room_id busy_time Free Time
2014-07-18 Room D 8.25 1.25
2014-07-19 Room 4 1.33 8.17
2014-07-18 Room 5 8 1.5
2014-07-19 Room 5 2 7.5
I am learning PostgreSQL right now so this problem is a little over my head. Any help or guidance would be greatly appreciated!
To handle gaps, I would recommend combining them first -- say using a CTE. The following does the logic by:
This should work, but you can validate the CTE before applying the logic in your other query (the only change is referring to the CTE rather than the base table).
As a query:
with r as (
select room_id, min(start_dt) as start_dt, max(end_dt) as end_ddt
from (select r.*,
count(*) over (filter where prev_end_dt < start_dt) over (partition by room_id date_trunc('day', start_dt) order by start_dt) as grp
from (select r.*,
max(end_dt) over (partition by room_id, date_trunc('day', start_dt) rows between unbounded preceding and 1 preceding) as prev_end_dt
from rooms r
) r
) r
group by room_id, grp
)
select date_trunc('day', start_dt), room_id,
sum( least(extract(epoch from end_dt), v.epoch2) -
greatest(extract(epoch from start_dt), epoch1)
) as busy_seconds,
(epoch2 - epoch1 -
sum( least(extract(epoch from end_dt), v.epoch2) -
greatest(extract(epoch from start_dt), epoch1)
)
) as free_seconds
from r cross join
(values (extract(epoch from date_trunc('day', start_dt) + interval '7 hours 30 minutes'),
extract(epoch from date_trunc('day', start_dt) + interval '17 hour')
)
) v(epoch1, epoch2)
group by date_trunc('day', start_dt), room_id