Search code examples
sqlpostgresqlwindow-functions

LAG function that ignores rows that match on a given column


I have a table like below in my Postgres 14 database.

In my SELECT I want to add two columns to show last_contract_id and next_contract_id if they exist. The tricky part is that if the rental_start_date is the same on the row below or above I want to skip that row until it changes. So in this case I want the last_contract_id for 807 to be null and not 806 since first on row with id 3 the rental_start_date changes.

I tried to add these two lines, but that doesn't skip peers with the same rental_start_date:

lag(contract_id) over (partition by objekt_id order by id) as last_contract_id, 
lead(contract_id) over (partition by objekt_id order by id) as next_contract_id 

Schema:

CREATE TABLE object_history (
    objekt_id int4 NOT NULL,
    id serial NOT NULL,
    rental_start date NULL,
    rental_end date NULL,
    contract_id int4 NULL
);

INSERT INTO public.object_history
(objekt_id, id, rental_start, rental_end, contract_id)
VALUES(77920, 6, '2023-06-01', '2100-01-01', 807);
INSERT INTO public.object_history
(objekt_id, id, rental_start, rental_end, contract_id)
VALUES(77920, 5, '2023-06-01', '2100-01-01', 806);
INSERT INTO public.object_history
(objekt_id, id, rental_start, rental_end, contract_id)
VALUES(77920, 4, '2023-06-01', '2100-01-01', 803);
INSERT INTO public.object_history
(objekt_id, id, rental_start, rental_end, contract_id)
VALUES(77920, 3, '2023-05-01', '2023-05-31', NULL);
INSERT INTO public.object_history
(objekt_id, id, rental_start, rental_end, contract_id)
VALUES(77920, 2, '2022-01-01', '2023-04-30', 802);
INSERT INTO public.object_history
(objekt_id, id, rental_start, rental_end, contract_id)
VALUES(77920, 1, '2017-11-01', '2021-12-31', NULL);
Select * from object_history;

| objekt_id | id  | rental_start             | rental_end               | contract_id |
| --------- | --- | ------------------------ | ------------------------ | ----------- |
| 77920     | 6   | 2023-06-01T00:00:00.000Z | 2100-01-01T00:00:00.000Z | 807         |
| 77920     | 5   | 2023-06-01T00:00:00.000Z | 2100-01-01T00:00:00.000Z | 806         |
| 77920     | 4   | 2023-06-01T00:00:00.000Z | 2100-01-01T00:00:00.000Z | 803         |
| 77920     | 3   | 2023-05-01T00:00:00.000Z | 2023-05-31T00:00:00.000Z |             |
| 77920     | 2   | 2022-01-01T00:00:00.000Z | 2023-04-30T00:00:00.000Z | 802         |
| 77920     | 1   | 2017-11-01T00:00:00.000Z | 2021-12-31T00:00:00.000Z |             |

View on DB Fiddle


Solution

  • That's hard with window functions. Would need subqueries ...

    Here is an exact solution with correlated subqueries instead of window functions:

    SELECT *
          , (SELECT contract_id FROM object_history o2 WHERE o2.objekt_id = o.objekt_id AND o2.rental_start IS DISTINCT FROM o.rental_start AND o2.id < o.id ORDER BY id DESC LIMIT 1) AS last_contract_id 
          , (SELECT contract_id FROM object_history o2 WHERE o2.objekt_id = o.objekt_id AND o2.rental_start IS DISTINCT FROM o.rental_start AND o2.id > o.id ORDER BY id      LIMIT 1) AS next_contract_id
    FROM   object_history o
    ORDER  BY objekt_id, rental_start, id
    

    fiddle

    Not exactly cheap, though.