Search code examples
postgresqlwindow-functions

Postgres: for each row evaluate all successive rows under conditions


I have this table:

id | datetime            | row_number 

1    2018-04-09 06:27:00   1
1    2018-04-09 14:15:00   2 
1    2018-04-09 15:25:00   3
1    2018-04-09 15:35:00   4
1    2018-04-09 15:51:00   5
1    2018-04-09 17:05:00   6
1    2018-04-10 06:42:00   7 
1    2018-04-10 16:39:00   8 
1    2018-04-10 18:58:00   9
1    2018-04-10 19:41:00   10
1    2018-04-14 17:05:00   11
1    2018-04-14 17:48:00   12 
1    2018-04-14 18:57:00   13

I'd count for each row the successive rows with time <= '01:30:00' and start the successive evaluation from the first row that doesn't meet the condition.

I try to exlplain better the question. Using windows function lag():

 SELECT id, datetime, 
        CASE WHEN datetime - lag (datetime,1)  OVER(PARTITION BY id ORDER BY datetime)   
        < '01:30:00' THEN 1 ELSE 0 END AS count
        FROM table

result is:

id | datetime            | count 

1    2018-04-09 06:27:00   0
1    2018-04-09 14:15:00   0 
1    2018-04-09 15:25:00   1
1    2018-04-09 15:35:00   1
1    2018-04-09 15:51:00   1
1    2018-04-09 17:05:00   1
1    2018-04-10 06:42:00   0 
1    2018-04-10 16:39:00   0 
1    2018-04-10 18:58:00   0
1    2018-04-10 19:41:00   1
1    2018-04-14 17:05:00   0
1    2018-04-14 17:48:00   1 
1    2018-04-14 18:57:00   1

But it's not ok for me because I want exclude row_number 5 because interval between row_number 5 and row_number 2 is > '01:30:00'. And start the new evaluation from row_number 5. The same for row_number 13.

The right output could be:

id | datetime            | count 

1    2018-04-09 06:27:00   0
1    2018-04-09 14:15:00   0 
1    2018-04-09 15:25:00   1
1    2018-04-09 15:35:00   1
1    2018-04-09 15:51:00   0
1    2018-04-09 17:05:00   1
1    2018-04-10 06:42:00   0 
1    2018-04-10 16:39:00   0 
1    2018-04-10 18:58:00   0
1    2018-04-10 19:41:00   1
1    2018-04-14 17:05:00   0
1    2018-04-14 17:48:00   1 
1    2018-04-14 18:57:00   0

So right count is 5.


Solution

  • I'd use a recursive query for this:

    WITH RECURSIVE tmp AS (
        SELECT
            id,
            datetime,
            row_number,
            0 AS counting,
            datetime AS last_start
        FROM mytable
        WHERE row_number = 1
        UNION ALL
        SELECT
            t1.id,
            t1.datetime,
            t1.row_number,
            CASE
                WHEN lateral_1.counting THEN 1
                ELSE 0
            END AS counting,
            CASE
                WHEN lateral_1.counting THEN tmp.last_start
                ELSE t1.datetime
            END AS last_start
        FROM
            mytable AS t1
        INNER JOIN
            tmp ON (t1.id = tmp.id AND t1.row_number - 1 = tmp.row_number),
        LATERAL (SELECT (t1.datetime - tmp.last_start) < '1h 30m'::interval AS counting) AS lateral_1
    )
    SELECT id, datetime, counting
    FROM tmp
    ORDER BY id, datetime;