Search code examples
sqlpostgresqldate

Postgres - Query distinct/longest date ranges


I'm looking to apply some logic that returns the distinct/longest date ranges in a particular table. In the example below, PKID 1 contains the date ranges in 2-4, then is proceeded by 5. I can't think of a good way to return 1 and 5 without writing multiple MIN() and MAX() aggregates, comparing lengths of time, or writing a 10 line CASE statement with LAG() functions. Is there an easier way to return PKID 1 and 5 without doing 20 levels of logic tests?

I've found a few posts that come close to this problem but aren't the same. I can't think of a way to implement the OVERLAP function in this instance, but enjoyed learning about it.

Find overlapping date ranges in PostgreSQL

How to find overlapping date ranges in Postgresql between multiple rows?

Postgresql query between date ranges

Sample data with the desired rows to be returned highlighted


Solution

  • Do you mean this?

    WITH
    indata(pkid,start_dt,end_dt) AS (
                SELECT 1,DATE '2020-06-29',DATE '2023-04-30'
      UNION ALL SELECT 2,DATE '2020-06-29',DATE '2021-06-27'
      UNION ALL SELECT 3,DATE '2022-03-07',DATE '2022-03-07'
      UNION ALL SELECT 4,DATE '2022-03-14',DATE '2022-03-14'
      UNION ALL SELECT 5,DATE '2023-05-01',DATE '2025-08-06'
    )
    SELECT
      *
    FROM indata
    ORDER BY end_dt - start_dt LIMIT 2
    ;
    
    pkid start_dt end_dt
    1 2020-06-29 2023-04-30
    5 2023-05-01 2025-08-06

    Or this?

    CREATE LOCAL TEMPORARY TABLE 
    indata(pkid,start_dt,end_dt)
    ON COMMIT PRESERVE ROWS  AS (
                SELECT 1,DATE '2020-06-29',DATE '2023-04-30'
      UNION ALL SELECT 2,DATE '2020-06-29',DATE '2021-06-27'
      UNION ALL SELECT 3,DATE '2022-03-07',DATE '2022-03-07'
      UNION ALL SELECT 4,DATE '2022-03-14',DATE '2022-03-14'
      UNION ALL SELECT 5,DATE '2023-05-01',DATE '2025-08-06'
    );
    
    WITH RECURSIVE rec_query(pkid,start_dt,end_dt) AS (
      SELECT
        *
      FROM indata
      WHERE pkid = 1
      UNION ALL
      SELECT
        next.*
      FROM indata AS next
      JOIN rec_query AS prev
        ON prev.end_dt + 1 = next.start_dt
    )
    SELECT * FROM rec_query;
    
    pkid start_dt end_dt
    1 2020-06-29 2023-04-30
    5 2023-05-01 2025-08-06