Search code examples
sqlpostgresqltime-serieswindow-functionsgaps-and-islands

How to group timestamps into islands (based on arbitrary gap)?


Consider this list of dates as timestamptz:

Postgres grouping dates

I grouped the dates by hand using colors: every group is separated from the next by a gap of at least 2 minutes.

I'm trying to measure how much a given user studied, by looking at when they performed an action (the data is when they finished studying a sentence.) e.g.: on the yellow block, I'd consider the user studied in one sitting, from 14:24 till 14:27, or roughly 3 minutes in a row.

I see how I could group these dates with a programming language by going through all of the dates and looking for the gap between two rows.

My question is: how would go about grouping dates in this way with Postgres?

(Looking for 'gaps' on Google or SO brings too many irrelevant results; I think I'm missing the vocabulary for what I'm trying to do here.)


Solution

  • SELECT done, count(*) FILTER (WHERE step) OVER (ORDER BY done) AS grp
    FROM  (
       SELECT done
            , lag(done) OVER (ORDER BY done) <= done - interval '2 min' AS step
       FROM   tbl
       ) sub
    ORDER  BY done;
    

    The subquery sub returns step = true if the previous row is at least 2 min away - sorted by the timestamp column done itself in this case.

    The outer query adds a rolling count of steps, effectively the group number (grp) - combining the aggregate FILTER clause with another window function.

    fiddle

    Related:

    About the aggregate FILTER clause: