I need to fill missing date values in orders table. DDL:
create table orders(order_date date, order_value int)
insert into orders values('2022-11-01',100),('2022-11-04 ',200),('2022-11-08',300)
Expected output is as:
order_date | order_value
-----------------------
2022-11-01 | 100
2022-11-02 | 100
2022-11-03 | 100
2022-11-04 | 200
2022-11-05 | 200
2022-11-06 | 200
2022-11-07 | 200
2022-11-08 | 300
I have solved the problem in ms sql using recursive query listed below.
with cte as (
select min(order_date) [min_date], MAX(order_date) [max_date]
FROM orders
), cte2 AS(
SELECT min_date [date]
FROM cte
UNION ALL
SELECT dateadd(day,1,date) [date]
FROM cte2
WHERE date < (SELECT max_date FROM cte)
), cte3 as(
select date [order_date], order_value
FROM cte2
LEFT JOIN orders on date = order_date
)
SELECT order_date,
FIRST_VALUE(order_value) IGNORE NULLS
OVER(ORDER BY order_date desc ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) [order_value]
FROM cte3
Is there any alternate approach to solve this problem or any way to optimize the recursive query?
Thanks in advance.
As mentioned in the comments, as you are on SQL Server 2022 you can make use of GENERATE_SERIES
here, which will be much more performant that a recursive Common Table Expression (rCTE). On prior versions, you can use an (inline) tally or a Calendar table.
WITH Dates AS(
SELECT MIN(order_date) AS MinDate,
MAX(order_date) AS MaxDate
FROM dbo.orders),
DateSeries AS(
SELECT DATEADD(DAY,GS.value,D.MinDate) AS Date
FROM Dates D
CROSS APPLY GENERATE_SERIES(0,DATEDIFF(DAY,MinDate,MaxDate),1) GS)
SELECT DS.Date,
FIRST_VALUE(O.order_value) IGNORE NULLS OVER (ORDER BY DS.Date desc ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS [order_value]
FROM DateSeries DS
LEFT JOIN dbo.orders O ON DS.Date = O.order_date
ORDER BY DS.Date;