Search code examples
sqlpostgresqldatedatetimegenerate-series

How to generate series using start and end date and quarters on postgres


I have a table like shown below where I want to use the start and end date to evenly distribute the value for each row to the 3 months in each quarter to all of the quarters in between start and end date (last two columns).

I am familiar with generate series and intervals in Postgres but I am having hard time to get what I want.

My table has and ID column that groups rows together, a quarter column that indicates which quarter the row references for the ID, a value column that is the value for the whole quarter (and every quarter in the date range), and start_date and end_date columns indicating the date range. Here is a sample:

ID  quarter     value   start_date  end_date 
1   2            152    2019-11-07  2050-12-30
1   1            785    2019-11-07  2050-12-30
2   2            152    2019-03-05  2050-12-30
2   1            785    2019-03-05  2050-12-30
3   4            41     2018-06-12  2050-12-30
3   3            50     2018-06-12  2050-12-30
3   2            88     2018-06-12  2050-12-30
3   1            29     2018-06-12  2050-12-30
4   2           1607    2018-12-17  2050-12-30
4   1           4803    2018-12-17  2050-12-30

Here is my desired output (for ID 1):

ID  quarter     value   start_date  end_date 
1   2            152/3  2020-04-01  2020-07-01
1   1            785/3  2020-01-01  2020-04-01
1   2            152/3  2021-04-01  2021-07-01
1   1            785/3  2021-01-01  2021-04-01

start_date in the output will be the next quarter on first table. I need the series to be generated from the start_date to the end_date of the first table.


Solution

  • You can do this by using the GENERATE_SERIES function and passing in the start and end date for each unique (by ID) row and setting the interval to 3 months. Then join the result back with your original table on both ID and quarter.

    Here's an example (note original_data is what I've called your first table):

    WITH
    quarters_table AS (
        SELECT
            t.ID,
            (EXTRACT('month' FROM t.quarter_date) - 1)::INT / 3 + 1 AS quarter,
            t.quarter_date::DATE AS start_date,
            COALESCE(
                LEAD(t.quarter_date) OVER (),
                DATE_TRUNC('quarter', t.original_end_date) + INTERVAL '3 months'
            )::DATE AS end_date
        FROM (
            SELECT
                original_record.ID,
                original_record.end_date AS original_end_date,
                GENERATE_SERIES(
                    DATE_TRUNC('quarter', original_record.start_date),
                    DATE_TRUNC('quarter', original_record.end_date),
                    INTERVAL '3 months'
                ) AS quarter_date
            FROM (
                SELECT DISTINCT ON (original_data.ID)
                    original_data.ID,
                    original_data.start_date,
                    original_data.end_date
                FROM
                    original_data
                ORDER BY
                    original_data.ID
            ) AS original_record
        ) AS t
    )
    
    SELECT
        quarters_table.ID,
        quarters_table.quarter,
        original_data.value::DOUBLE PRECISION / 3 AS value,
        quarters_table.start_date,
        quarters_table.end_date
    FROM
        quarters_table
    INNER JOIN
        original_data
    ON
        quarters_table.ID = original_data.ID
        AND quarters_table.quarter = original_data.quarter;
    

    Sample output:

     id | quarter |      value       | start_date |  end_date  
    ----+---------+------------------+------------+------------
      1 |       1 | 261.666666666667 | 2020-01-01 | 2020-04-01
      1 |       2 | 50.6666666666667 | 2020-04-01 | 2020-07-01
      1 |       1 | 261.666666666667 | 2021-01-01 | 2021-04-01
      1 |       2 | 50.6666666666667 | 2021-04-01 | 2021-07-01
    

    For completeness, here's the original_data table I've used in testing:

    WITH
    original_data AS (
        SELECT
            1 AS ID,
            2 AS quarter,
            152 AS value,
            '2019-11-07'::DATE AS start_date,
            '2050-12-30'::DATE AS end_date
        
        UNION ALL
    
        SELECT
            1 AS ID,
            1 AS quarter,
            785 AS value,
            '2019-11-07'::DATE AS start_date,
            '2050-12-30'::DATE AS end_date
        
        UNION ALL
    
        SELECT
            2 AS ID,
            2 AS quarter,
            152 AS value,
            '2019-03-05'::DATE AS start_date,
            '2050-12-30'::DATE AS end_date
        
        -- ...
    )