I have a table like this where TIMELINE is just an overall series of dates in order and DATE that I've matched to the dates in TIMELINE and contains actual records of dates that have values (shown in the NUMBER column)
TIMELINE | DATE | NUMBER |
---|---|---|
2022-03-03 | 2022-03-03 | NULL |
2022-03-04 | 2022-03-04 | 40 |
2022-03-07 | NULL | NULL |
2022-03-08 | NULL | NULL |
2022-06-08 | 2022-06-08 | 45 |
2022-06-28 | 2022-06-28 | NULL |
2022-06-29 | NULL | NULL |
2022-06-30 | NULL | NULL |
2022-07-08 | 2022-07-08 | 80 |
I am trying to fill in this table so that for every DATE that is NULL, its NUMBER becomes the value of the most recent/max DATE that is <= to the current date of the row I am looking at. So for example, in the third row, the DATE is NULL so its NUMBER would become the value (40) that is associated with the date 2022-03-04 and the same thing for fourth row. Then for the third to last and the second to last row, its NUMBER would be NULL because that is the value associated with the most recent date before that row.
I am trying to get an output like this:
TIMELINE | DATE | NUMBER |
---|---|---|
2022-03-03 | 2022-03-03 | NULL |
2022-03-04 | 2022-03-04 | 40 |
2022-03-07 | NULL | 40 |
2022-03-08 | NULL | 40 |
2022-06-08 | 2022-06-08 | 45 |
2022-06-28 | 2022-06-28 | NULL |
2022-06-29 | NULL | NULL |
2022-06-30 | NULL | NULL |
2022-07-08 | 2022-07-08 | 80 |
Would this be achieved by joining the table onto itself?/Is there a function that would help me do this? Thanks!
Here's an example of the update query that would produce the desired results. Although I've used your reserved words, I would suggest avoiding those in your modeling.
--
-- Create test table
--
create
or replace table x (timeline date, date date, number integer);
--
-- load test data
--
insert into
x
values
('2022-03-03'::date,'2022-03-03'::date,NULL), ('2022-03-04'::date,'2022-03-04'::date,40), ('2022-03-07'::date,NULL,NULL), ('2022-03-08'::date,NULL,NULL), ('2022-06-08'::date,'2022-06-08'::date,45),('2022-06-28'::date,'2022-06-28'::date,NULL), ('2022-06-29'::date,NULL,NULL), ('2022-06-30'::date,NULL,NULL), ('2022-07-08'::date,'2022-07-08'::date,80);
--
-- Update the original table via a correlation
--
update
x
set
x.number = z.number
from
(
select
a.timeline,
a.date,
case
when (a.date is NULL) then b.number
else a.number
end number
from
x a
left join x b on b.timeline = (
select
max(timeline)
from
x
where
timeline < a.timeline
and date is not null
)
) z
where
x.timeline = z.timeline
and x.date = z.date;
--
-- Results
--
select * from x order by 1;
Results:
TIMELINE DATE NUMBER
2022-03-03 2022-03-03
2022-03-04 2022-03-04 40
2022-03-07 40
2022-03-08 40
2022-06-08 2022-06-08 45
2022-06-28 2022-06-28
2022-06-29
2022-06-30
2022-07-08 2022-07-08 80