I want to create a row for every month between two dates, the first day of every month should be the day of the start date or the first day of every month, and the last date should be the last day of every month or the end date, with average (if date start = 15, then the average should be 15/30) for my table.
input :
product_id | date_start | date_end
1 | 16-01-2020 | 15-03-2020
2 | 07-01-2020 | 22-04-2020
The result should be :
product_id | date_start | date_end | average
1 | 16-01-2020 | 31-01-2020 | 0.5
1 | 01-02-2020 | 29-02-2020 | 1
1 | 01-03-2020 | 15-03-2020 | 0.5
2 | 07-01-2020 | 31-01-2020 | 0.76 -- (30-07)/30
2 | 01-02-2020 | 29-02-2020 | 1
2 | 01-03-2020 | 31-03-2020 | 1
2 | 01-04-2020 | 22-04-2020 | 0.76
I tried using generate series and date trunc and union
SELECT (date_trunc('month', dt) + INTERVAL '1 MONTH' ):: DATE AS date_start ,
(date_trunc('month', dt) + INTERVAL '2 MONTH - 1 day' ):: DATE AS date_end
FROM generate_series( DATE '2020-01-15', DATE '2020-05-21', interval '1 MONTH' ) AS dt
union select '2020-01-15' as date_start,
(date_trunc('month', '2020-01-15'::date) + INTERVAL '1 MONTH - 1 day' ):: DATE AS date_end
union select (date_trunc('month', '2020-05-21'::date) ):: DATE AS date_start ,
'2020-05-21' AS date_end
order by date_start
To adding average I calculate the difference between two dates
SELECT (date_trunc('month', dt) + INTERVAL '1 MONTH' ):: DATE AS date_start ,
(date_trunc('month', dt) + INTERVAL '2 MONTH - 1 day' ):: DATE AS date_end,
((date_trunc('month', dt) + INTERVAL '2 MONTH - 1 day' ) - (date_trunc('month', dt) + INTERVAL '1 MONTH' ):: DATE )
FROM generate_series( DATE '2020-01-15', DATE '2020-05-21', interval '1 MONTH' ) AS dt
with this it seemed like I was hit a wall.
The following gives approximately the same result as you desired, only averages deviates. I believe this stems from an inconsistency in the your calculations where the dates are inclusive in some and excludes either start or end date in others, I was inclusive in all. The other area of difference being I used the actual number of days in the month for denominator calculating it instead of 30. This is necessary for Feb to ever have average 1, otherwise max would be 0.97, and full months having 31 days would average 1.03.
with product_dates(product_id, date_start, date_end) as
( values (1,'2020-01-16'::date,'2020-03-15'::date)
, (2,'2020-01-07'::date,'2020-04-22'::date)
)
select product_id, start_date, end_date, round((end_date-start_date+1 ) * 1.0 / (eom-som+1),2) average
from (select product_id
, greatest(date_start,dt::date) start_date
, least(date_end, (dt+interval '1 month' -interval '1 day')::date) end_date
, dt::date som
, (dt+interval '1 month' -interval '1 day')::date eom
from product_dates
cross join generate_series(date_trunc('month', date_start)
,date_trunc('month', date_end) + interval '1 month' - interval '1 day'
,interval '1 month'
) gs(dt)
) s1;
The heart is the generate_series working directly with dates, notice the date manipulation to ensure I had first day and last day of month. Then in the outer portion of the quest I selected those dates or the parameter date or the generated one (greatest and least functions),