Select * but group by similar timestamps (not in fixed intervals)

I've seen a lot of grouping by timestamps with fixed intervals (e.g. 0-10, 11-20 mins) etc. However I want to ignore duplicate date in a database which have Similar timestamps, that in theory could span fixed width intervals e.g. 10:09 and 10:11. So I want the intervals to be dynamic and relative to each row.

What's the best approach to show all the "unique" entries, grabbing only the first entry where the timestamps are +/- 5 minutes on the others.

Thanks

Solution

This is a gaps and islands problem. Let's start by creating a table of test values and populating it:

CREATE TABLE test_values AS (
SELECT id, started_at::TIMESTAMP AS started_at
      FROM (VALUES (1, '2024-07-01 00:00:00'),
                   (2, '2024-07-01 00:10:00'),
                   (3, '2024-07-01 00:15:00'),
                   (4, '2024-07-01 00:19:00'),
                   (5, '2024-07-01 00:25:00')) v (id, started_at));

The following query demonstrates an approach to finding the starting times of each cluster:

WITH
  params AS (SELECT INTERVAL '5' MINUTE AS spread),
  ng AS (
    SELECT tv.id,
           tv.started_at,
           tv.started_at - LAG(tv.started_at) OVER (ORDER BY tv.started_at) > params.spread IS NOT FALSE AS new_group
      FROM params
      CROSS JOIN test_values tv)
SELECT ng.*
  FROM ng
  WHERE ng.new_group
  ORDER BY ng.started_at;

The query works by identifying each started_at that neither overlaps nor is contiguous with the most recent earlier time plus spread. The comparison is checked with IS NOT FALSE so that the first time will be identified as starting a group:

id	started_at	new_group
1	2024-07-01 00:00:00	true
2	2024-07-01 00:10:00	true
5	2024-07-01 00:25:00	true

If > is changed to >=, then clusters will consist of times that are within spread of adjacent times; i.e, adjacent times that are exactly spread apart will be in different clusters:

id	started_at	new_group
1	2024-07-01 00:00:00	true
2	2024-07-01 00:10:00	true
3	2024-07-01 00:15:00	true
5	2024-07-01 00:25:00	true