Search code examples
sqlpostgresqldatetimegroup-bylag

compute time difference between rows and conditional group by using PostgreSQL


I have data like as shown below:

+-------+-------+
| index | time  |
+-------+-------+
|     1 | 09.00 |
|     2 | 09.02 |
|     3 | 09.03 |
|     4 | 09.05 |
|     5 | 09.11 |
|     6 | 09.12 |
|     7 | 09.15 |
|     8 | 09.22 |
|     9 | 09.33 |
+-------+-------+

If the time difference between rows is <= 5 mins, assign them the same id as shown below. I expect my output to be like as shown below:

+-------+-------+
| index | time  |
+-------+-------+
|     1 | 09.00 |
|     1 | 09.02 |
|     1 | 09.03 |
|     1 | 09.05 |
|     2 | 09.11 |
|     2 | 09.12 |
|     2 | 09.15 |
|     3 | 09.22 |
|     4 | 09.33 |
+-------+-------+

I was trying something like below

    select index, (lag_time - time) from (
    select 
        index, 
        time,
        LAG(time,1) OVER (
            ORDER BY time
        ) lag_time) A

Solution

  • You just need a cumulative sum:

    select t.*,
           count(*) filter (where prev_time < time - interval '5 minute') over (order by time) as index
    from (select t.*,
                 lag(time) over (order by time) as prev_time
          from t
         ) t;
    

    Here is a db<>fiddle.