I have the following table:
CREATE TABLE IF NOT EXISTS test (
date DATE,
value INT
);
INSERT INTO test VALUES
( '2022-01-11', NULL), -- Tues
( '2022-01-10', 5), -- Mon
( '2022-01-09', NULL), -- Sun
( '2022-01-08', 6), -- Sat
( '2022-01-07', NULL), -- Fri
( '2022-01-06', NULL), -- Thur
( '2022-01-05', 5), -- Wed
( '2022-01-04', NULL), -- Tues
( '2022-01-03', NULL), --Mon
( '2022-01-02', NULL), -- Sun
( '2022-01-01', NULL); -- Sat
What is a query that forward and backfills the value column, for calendar week (Monday through Sunday)
For a calendar week should forward fill null values up to the next non-null value, and backfill null values up to the previous non-null value. If the calendar week has all null values then the week should be null
The result should look like this:
date | VALUES |
---|---|
'2021-01-11' | 5 -- Tues, filled from Monday |
'2022-01-10' | 5 -- Mon |
'2022-01-09' | 6 -- Sun Filled from saturday |
'2022-01-08' | 6 -- Sat |
'2022-01-07' | 5 -- Fri Filled from Wed |
'2022-01-06' | 5 -- Thur Filled from Wed |
'2022-01-05' | 5 -- Wed |
'2022-01-04' | 5 -- Tues Filled from Wed |
'2022-01-03' | 5 --Mon Filled from Wed |
'2022-01-02' | Null -- Sun |
'2022-01-01' | Null -- Sat No values in week, therefor null |
That would have been a good spot for last_value
with option ignore nulls
, but unfortunately Postgres does not support the latter.
Here is one way to do it in Postgres with window functions:
select date, value,
max(value) over(partition by date_trunc('week', date), grp) new_value
from (
select t.*,
count(value) over(partition by date_trunc('week', date) order by date) grp
from test t
) t
order by date desc
The approach is to build groups of consecutive rows of the same week that contain a non-null value, possibly followed by null values, using a window count
, and then bring the latest previous non-null value. You can run the subquery separately to see how groups are built.
If you also want the query to look for the next value when no previous value is availability in the same week, we can just expand the logic and define two groups for each row, then try both:
select date, value,
coalesce(
max(value) over(partition by date_trunc('week', date), grp1),
max(value) over(partition by date_trunc('week', date), grp2)
) new_value
from (
select t.*,
count(value) over(partition by date_trunc('week', date) order by date ) grp1,
count(value) over(partition by date_trunc('week', date) order by date desc) grp2
from test t
) t
order by date desc