Search code examples
sqldateamazon-redshiftrecursive-query

Create a new row for each day between dates from two different columns in Redshift SQL


I am working with a table in Redshift. It contains rows with some information and two dates (a start date and an end date). I can't seem to be able to figure out a way to create a new row for each day between the start and end date. For example, here the code to make a simple table:

CREATE TEMPORARY TABLE dates (name VARCHAR(50), start_date DATETIME, end_date DATETIME)

INSERT INTO dates
VALUES
       ('Peter F.','2018-03-01','2018-03-05'),
       ('Sam R.','2018-04-17', '2018-04-20');

SELECT * FROM dates;

How do I go from

name          | start_date                | end_date                   | 
Peter F.      | 2018-04-17 00:00:00.000000| 2018-04-20 00:00:00.000000 |
Sam R.        | 2018-03-01 00:00:00.000000| 2018-03-05 00:00:00.000000 |

to:

name          | start_date                | end_date                   | 
Peter F.      | 2018-04-17 00:00:00.000000| 2018-04-20 00:00:00.000000 |
Peter F.      | 2018-04-18 00:00:00.000000| 2018-04-20 00:00:00.000000 |
Peter F.      | 2018-04-19 00:00:00.000000| 2018-04-20 00:00:00.000000 |
Peter F.      | 2018-04-20 00:00:00.000000| 2018-04-20 00:00:00.000000 |
Sam R.        | 2018-03-01 00:00:00.000000| 2018-03-05 00:00:00.000000 |
Sam R.        | 2018-03-02 00:00:00.000000| 2018-03-05 00:00:00.000000 |
Sam R.        | 2018-03-03 00:00:00.000000| 2018-03-05 00:00:00.000000 |
Sam R.        | 2018-03-04 00:00:00.000000| 2018-03-05 00:00:00.000000 |
Sam R.        | 2018-03-05 00:00:00.000000| 2018-03-05 00:00:00.000000 |

I tried using a recursive CTE like this:

WITH cte
     AS (SELECT name,
                start_date,
                end_date
         FROM   dates
         UNION ALL
         SELECT name,
                Dateadd(day, 1, start_date),
                end_date
         FROM   cte
         WHERE  start_date < end_date)
SELECT *
FROM   cte

But it gives me an error:

[Amazon](500310) Invalid operation: relation "cte" does not exist;

Hoping for some expert advice, as I'd really like to be able to do this in SQL... Thank you in advance!


Solution

  • Redshift does not support recursive common table expressions.

    One approach is to build a table of numbers:

    create table nums(n int);
    insert into nums values(0), (1), (2), (3), ...
    

    You can then join the numbers table with the original table to generate the expected result:

    select
        d.name,
        dateadd(day, n.n, d.start_date) start_date,
        d.end_date
    from dates d
    inner join nums n 
        on dateadd(day, n.n, d.start_date) <= d.end_date
    

    You can also list the numbers as a derived table directly in your query, or use row_number() against a large table.