I am working with a table in Redshift. It contains rows with some information and two dates (a start date and an end date). I can't seem to be able to figure out a way to create a new row for each day between the start and end date. For example, here the code to make a simple table:
CREATE TEMPORARY TABLE dates (name VARCHAR(50), start_date DATETIME, end_date DATETIME)
INSERT INTO dates
VALUES
('Peter F.','2018-03-01','2018-03-05'),
('Sam R.','2018-04-17', '2018-04-20');
SELECT * FROM dates;
How do I go from
name | start_date | end_date |
Peter F. | 2018-04-17 00:00:00.000000| 2018-04-20 00:00:00.000000 |
Sam R. | 2018-03-01 00:00:00.000000| 2018-03-05 00:00:00.000000 |
to:
name | start_date | end_date |
Peter F. | 2018-04-17 00:00:00.000000| 2018-04-20 00:00:00.000000 |
Peter F. | 2018-04-18 00:00:00.000000| 2018-04-20 00:00:00.000000 |
Peter F. | 2018-04-19 00:00:00.000000| 2018-04-20 00:00:00.000000 |
Peter F. | 2018-04-20 00:00:00.000000| 2018-04-20 00:00:00.000000 |
Sam R. | 2018-03-01 00:00:00.000000| 2018-03-05 00:00:00.000000 |
Sam R. | 2018-03-02 00:00:00.000000| 2018-03-05 00:00:00.000000 |
Sam R. | 2018-03-03 00:00:00.000000| 2018-03-05 00:00:00.000000 |
Sam R. | 2018-03-04 00:00:00.000000| 2018-03-05 00:00:00.000000 |
Sam R. | 2018-03-05 00:00:00.000000| 2018-03-05 00:00:00.000000 |
I tried using a recursive CTE like this:
WITH cte
AS (SELECT name,
start_date,
end_date
FROM dates
UNION ALL
SELECT name,
Dateadd(day, 1, start_date),
end_date
FROM cte
WHERE start_date < end_date)
SELECT *
FROM cte
But it gives me an error:
[Amazon](500310) Invalid operation: relation "cte" does not exist;
Hoping for some expert advice, as I'd really like to be able to do this in SQL... Thank you in advance!
Redshift does not support recursive common table expressions.
One approach is to build a table of numbers:
create table nums(n int);
insert into nums values(0), (1), (2), (3), ...
You can then join the numbers table with the original table to generate the expected result:
select
d.name,
dateadd(day, n.n, d.start_date) start_date,
d.end_date
from dates d
inner join nums n
on dateadd(day, n.n, d.start_date) <= d.end_date
You can also list the numbers as a derived table directly in your query, or use row_number()
against a large table.