Hi I have a table that looks like the following
grouping_coulmn | value | date_modified |
---|---|---|
1 | 5 | 2020-10-15 |
1 | 10 | 2020-10-20 |
2 | 3 | 2020-10-20 |
1 | 11 | 2020-11-30 |
1 | 11 | 2020-12-10 |
1 | 5 | 2020-12-15 |
How could I make a query that returns the following results
grouping_column | last_value_of_month | month |
---|---|---|
1 | 10 | OCT 2020 |
1 | 11 | NOV 2020 |
1 | 5 | DIC 2020 |
1 | 5 | JAN 2021 |
2 | 3 | OCT 2020 |
2 | 3 | NOV 2020 |
2 | 3 | DIC 2020 |
2 | 3 | JAN 2021 |
In other words it should return the last value of the group each month, from the first entry until the current month. I could work it out if you don't fill the missing months, but I don't know how to work that out.
NOTE: this question was asked on January 2021, just for context.
First, generate all the months based on the oldest date in the table:
with months as (
select ddate + interval '1 month' as end_date,
to_char(ddate, 'MON YYYY') as month
from generate_series(
date_trunc(
'month',
(select min(date_modified) from table1)
),
now(),
interval '1 month'
) as gs(ddate)
)
Join that back to your data table, and use distinct on
to limit the result to one record per (grouping_column, month)
:
select distinct on (t.grouping_column, m.end_date)
t.grouping_column, t.value as last_value_of_month, m.month
from months m
join table1 t
on t.date_modified < m.end_date
order by t.grouping_column, m.end_date, t.date_modified desc;
Result:
grouping_column | last_value_of_month | month
--------------: | ------------------: | :-------
1 | 10 | OCT 2020
1 | 11 | NOV 2020
1 | 5 | DEC 2020
1 | 5 | JAN 2021
2 | 3 | OCT 2020
2 | 3 | NOV 2020
2 | 3 | DEC 2020
2 | 3 | JAN 2021
db<>fiddle here