sql postgresql group-by aggregate-functions date-range

Sum() up CASE values from joined tables in PostgreSQL

I'm creating a hotel booking tool for our desk/reception. My current SELECT doesn't compute the "final" sum()/CASE, meaning that as long as there are more then one overlap (hb.booking_period && cp.valid_period) per booking_ID, the CASE values in the data output only represent shares of the total price. If you sum up those shares, it matches the total price (of the stay). How can I calculate the sums of all these CASE results/shares (that fit the related booking_id)?

SELECT
    hb.booking_id,
    hb.guest_name,
    hb.room_category_id,
    hb.booking_period,
    cp.base_price,
    CASE
        WHEN upper(hb.booking_period * cp.valid_period) = upper(hb.booking_period) THEN
            (upper(hb.booking_period * cp.valid_period) - lower(hb.booking_period * cp.valid_period)) * cp.base_price
        ELSE
            (upper(hb.booking_period * cp.valid_period) - lower(hb.booking_period * cp.valid_period) + 1) * cp.base_price
    END

FROM
    hotel_bookings hb
JOIN
    category_prices cp ON hb.room_category_id = cp.room_category_id AND hb.booking_period && cp.valid_period
GROUP BY
    hb.booking_id, hb.guest_name, hb.room_category_id, hb.booking_period, cp.base_price, cp.valid_period;

booking_period and valid_period are daterange datatypes.

The CASE ensures that the intersection * base_price is calculated correctly, since otherwise for each intersection 1 stay would be lost.

Here are the table definitions:

CREATE TABLE room_categories 
(
    category_id SERIAL PRIMARY KEY,
    category_name VARCHAR(25)
);


CREATE TABLE category_prices 
(
    category_price_id SERIAL PRIMARY KEY,
    room_category_id INTEGER REFERENCES room_categories(category_id),
    valid_period daterange,
    base_price DECIMAL(10, 2)
);

CREATE TABLE hotel_bookings 
(
    booking_id SERIAL PRIMARY KEY,
    guest_name VARCHAR(255),
    room_category_id INTEGER REFERENCES room_categories(category_id),
    booking_period daterange
);

The data:

INSERT INTO room_categories (category_name) VALUES
    ('single room'),
    ('double room');

INSERT INTO category_prices (room_category_id, valid_period, base_price) VALUES
    (1, '[2023-01-01, 2023-01-31)', 80.00),
    (1, '[2023-02-01, 2023-02-28)', 85.00),
    (1, '[2023-03-01, 2023-03-31)', 88.00),
    (2, '[2023-01-01, 2023-01-31)', 100.00),
    (2, '[2023-02-01, 2023-02-28)', 105.00),
    (2, '[2023-03-01, 2023-03-31)', 108.00);

INSERT INTO hotel_bookings (guest_name, room_category_id, booking_period) VALUES
    ('John Doe', 1, '[2023-01-15, 2023-01-20)'), --correct calced, 1 intersection
    ('Jane Smith', 1, '[2023-01-30, 2023-02-02)'), -- 2 shares (need/want a sumup)
    ('Jane Smith', 1, '[2023-02-25, 2023-03-03)'),
    ('Jordan Miller', 2, '[2023-01-30, 2023-03-02)'); -- 3 shares (need/want a sumup)

Solution

sum(case when a then b else c end) is perfectly allowed. Looking at your edits with examples, my guess is that you're grouping too deep: the periods and their prices multiply your booking rows, spawning a separate entry for each hb.booking_period && cp.valid_period intersection. To collapse them into one row per booking with one valid sum, you can aggregate those, e.g. with jsonb_object_agg(). Demo at db<>fiddle:

SELECT
    hb.booking_id,
    hb.guest_name,
    hb.room_category_id,
    hb.booking_period,
    jsonb_pretty(jsonb_object_agg(cp.valid_period,cp.base_price)),
    SUM(CASE WHEN  upper(hb.booking_period * cp.valid_period) = upper(hb.booking_period) 
             THEN (upper(hb.booking_period * cp.valid_period) - lower(hb.booking_period * cp.valid_period)) * cp.base_price
             ELSE (upper(hb.booking_period * cp.valid_period) - lower(hb.booking_period * cp.valid_period) + 1) * cp.base_price
        END )
FROM hotel_bookings hb
JOIN category_prices cp 
ON hb.room_category_id = cp.room_category_id 
AND hb.booking_period && cp.valid_period
GROUP BY hb.booking_id, 
  hb.guest_name, 
  hb.room_category_id, 
  hb.booking_period;

booking_id	guest_name	room_category_id	booking_period	jsonb_pretty	sum
1	John Doe	1	[2023-01-15,2023-01-20)	{"[2023-01-01,2023-01-31)": 80.00}	400.00
2	Jane Smith	1	[2023-01-30,2023-02-02)	{"[2023-01-01,2023-01-31)": 80.00,"[2023-02-01,2023-02-28)": 85.00 }	245.00
3	Jane Smith	1	[2023-02-25,2023-03-03)	{"[2023-02-01,2023-02-28)": 85.00, "[2023-03-01,2023-03-31)": 88.00}	516.00
4	Jordan Miller	2	[2023-01-30,2023-03-02)	{"[2023-01-01,2023-01-31)": 100.00, "[2023-02-01,2023-02-28)": 105.00, "[2023-03-01,2023-03-31)": 108.00}	3248.00

The reason for that case statement is probably that you unintentionally used upper bound exclusive ranges for your category prices. If you make them upper bound inclusive, you can get rid of that whole thing and just multiply the booking_period*valid_period intersection length by the base_price: demo2

UPDATE category_prices 
SET valid_period=daterange(lower(valid_period),upper(valid_period),'[]');

SELECT
    hb.booking_id,
    hb.guest_name,
    hb.room_category_id,
    hb.booking_period,
    jsonb_pretty(jsonb_object_agg(cp.valid_period,cp.base_price)),
    SUM((  upper(hb.booking_period * cp.valid_period) 
         - lower(hb.booking_period * cp.valid_period)) 
        * cp.base_price )
FROM hotel_bookings hb
JOIN category_prices cp 
ON hb.room_category_id = cp.room_category_id 
AND hb.booking_period && cp.valid_period
GROUP BY hb.booking_id, 
  hb.guest_name, 
  hb.room_category_id, 
  hb.booking_period;

See the cheatsheet:

select raw,
       dr AS canonicalized,
       lower(dr),
       lower_inc(dr),
       upper(dr),
       upper_inc(dr)
from (values ('(2023-01-01, 2023-01-31)'),
             ('(2023-01-01, 2023-01-31]'),
             ('[2023-01-01, 2023-01-31)'),
             ('[2023-01-01, 2023-01-31]'))AS a(raw),
     lateral (select raw::daterange)AS b(dr);

raw	canonicalized	lower	lower_inc	upper	upper_inc
(2023-01-01, 2023-01-31)	[2023-01-02,2023-01-31)	2023-01-02	t	2023-01-31	f
(2023-01-01, 2023-01-31]	[2023-01-02,2023-02-01)	2023-01-02	t	2023-02-01	f
[2023-01-01, 2023-01-31)	[2023-01-01,2023-01-31)	2023-01-01	t	2023-01-31	f
[2023-01-01, 2023-01-31]	[2023-01-01,2023-02-01)	2023-01-01	t	2023-02-01	f

fiddle