Search code examples
sqlpostgresqlgreatest-n-per-grouplateral-join

Get next and previous rows ordered by column for a subset of table


I have a subset of a table:

-- Subset
+---------------------------------+---------+-----------+--+
|date_time                        |animal_id|location_id|id|
+---------------------------------+---------+-----------+--+
|2023-04-17 11:11:11.000000 +00:00|43       |55         |11|
|2023-04-17 12:12:12.000000 +00:00|44       |57         |12|
+---------------------------------+---------+-----------+--+

The table contains next and previous rows for these rows for animal_id.

How to get those next and previous rows (ordered by date_time column) for each row in the subset?

-- Previous
+---------------------------------+---------+-----------+--+
|date_time                        |animal_id|location_id|id|
+---------------------------------+---------+-----------+--+
|2023-04-17 01:01:01.000000 +00:00|43       |45         |1 |
|2023-04-17 02:02:02.000000 +00:00|44       |47         |2 |
+---------------------------------+---------+-----------+--+

The query for last rows should return the rows before each row of the subset, ordered by date_time, for the animal_id of the row in the subset.

-- Next
+---------------------------------+---------+-----------+--+
|date_time                        |animal_id|location_id|id|
+---------------------------------+---------+-----------+--+
|2023-04-17 21:21:21.000000 +00:00|43       |65         |21|
|2023-04-17 22:22:22.000000 +00:00|44       |67         |22|
+---------------------------------+---------+-----------+--+

The same as previous, but rows after.

I found a simmilar question with an answer, but ultimately failed to integrate it for my use case as the method described there works for a single row (or so I think).

Table:

create table animals__locations (
    date_time   timestamptz default CURRENT_TIMESTAMP not null,
    animal_id   integer     not null,
    location_id integer     not null,
    id          serial      primary key
);

There is no unique constraint on (animal_id, date_time), yet. (Will create one!)
If there is no next/previous row, the row can be omitted (that is, no row in the resulting output)

DB schema

Subset query

PG version: 15.2


Solution

  • A join to a LATERAL subquery should do it.

    Previous:

    SELECT prev.*
    FROM   "subset" t
    CROSS  JOIN LATERAL (
       SELECT *
       FROM   animals__locations t1
       WHERE  t1.animal_id = t.animal_id
       AND    t1.date_time < t.date_time
       ORDER  BY t1.date_time DESC
       LIMIT  1
       ) prev;
    

    Next:

    SELECT next.*
    FROM   "subset" t
    CROSS  JOIN LATERAL (
       SELECT *
       FROM   animals__locations t1
       WHERE  t1.animal_id = t.animal_id
       AND    t1.date_time > t.date_time
       ORDER  BY t1.date_time
       LIMIT  1
       ) next;
    

    fiddle

    There are various possible corner cases:

    1. If there can be null values. --> All relevant columns NOT NULL.
    2. If there can be identical (animal_id, date_time). --> Ruled out with new UNIQUE constraint.
    3. If there is no next / previous row. --> No row. That's what my queries do.

    You need an index on (animal_id, date_time) for this to be fast. Create a UNIQUE constraint on (animal_id, date_time), that rules out corner case 2. and provides that index implicitly.

    Related: