My source data is following
declare @dim as table
(
name varchar(10),period int
)
insert into @dim
select * from
(
values
('a', 202001),('a', 202002),('a', 202003),('a', 202004),
('a', 202005),('a', 202006),('a', 202007),('a', 202008),
('a', 202009),('a', 202010),('a', 202011),('a', 202012),
('b', 202001),('b', 202002),('b', 202003),('b', 202004),
('b', 202005),('b', 202006),('b', 202007),('b', 202008),
('b', 202009),('b', 202010),('b', 202011),('b', 202012)
) t (one, two)
declare @fact as table
(
name varchar(max),period bigint,val decimal(19, 2)
)
insert into @fact
select * from
(
values
('a', 202002, 100),
('a', 202005, 600),
('a', 202010, 700),
('b', 202004, 500),
('b', 202007, 600),
('b', 20208, 1000)
) t (one, two, three)
I was wondering if it is possible at all to achieve a fill-down inside an outer-apply
. I tried as following but it did not work
select a.name,a.period,x.val, y.FD
from @dim a
outer apply (select * from @fact b where a.name=b.name and a.period=b.period) as x
outer apply (select max(x.val) over (partition by a.name order by a.period ASC ROWS UNBOUNDED PRECEDING)) as y (FD)
The reason why I am keen on achieving the fill down inside outer-apply
because I can keep on utilizing that filled down column from outer-apply
in the same query in succeeding apply
to create further calculated columns or transformation without the help of any temp table as following
select a.name,a.period,x.val, y.FD, z.bucket
from @dim a
outer apply (select * from @fact b where a.name=b.name and a.period=b.period) as x
outer apply (successful fill down) as y (FD)
outer apply (VALUES(CASE WHEN FD>1000 then 'bucket1' else 'bucket2' end) as z(bucket)
My desired result
name | period | val | FD |
---|---|---|---|
a | 202001 | NULL | NULL |
a | 202002 | 100.00 | 100.00 |
a | 202003 | NULL | 100.00 |
a | 202004 | NULL | 100.00 |
a | 202005 | 600.00 | 600.00 |
a | 202006 | NULL | 600.00 |
a | 202007 | NULL | 600.00 |
a | 202008 | NULL | 600.00 |
a | 202009 | NULL | 600.00 |
a | 202010 | 700.00 | 700.00 |
a | 202011 | NULL | 700.00 |
a | 202012 | NULL | 700.00 |
b | 202001 | NULL | NULL |
b | 202002 | NULL | NULL |
b | 202003 | NULL | NULL |
b | 202004 | 500.00 | 500.00 |
b | 202005 | NULL | 500.00 |
b | 202006 | NULL | 500.00 |
b | 202007 | 600.00 | 600.00 |
b | 202008 | NULL | 600.00 |
b | 202009 | NULL | 600.00 |
b | 202010 | NULL | 600.00 |
b | 202011 | NULL | 600.00 |
b | 202012 | NULL | 600.00 |
Which can be achieved by following but outside of an outer-apply
which I am not looking for.
select a.name,a.period,x.val, MAX(x.val) over (partition by a.name order by a.period ASC ROWS UNBOUNDED PRECEDING) as FD
from @dim a
outer apply (select * from @fact b where a.name=b.name and a.period=b.period) as x
Your problem is that you are putting the MAX
window function inside an APPLY
.
When you use an APPLY
, the whole subquery is (logically) evaluated for each row. So the subquery, which only contains a select
, only has one row, and that is evaluated per row of the outer table.
You cannot use window functions in this way. Window functions are only (logically) evaluated after joining and grouping, in the select
phase and before ordering. The APPLY
comes in the join phase, which is earlier.
Note that your first
APPLY
can be rewritten as a simpleLEFT JOIN
.
select
a.name,
a.period,
x.val,
FD = max(x.val) over
(partition by a.name
order by a.period ROWS UNBOUNDED PRECEDING)
from @dim a
left join @fact x on a.name = x.name and a.period = x.period
If you want the MAX
to be used in other parts of the query, you must place it in a derived table.
select *,
SomeOtherCalculationInvolvingFD
from (
select
a.name,
a.period,
x.val,
FD = max(x.val) over
(partition by a.name
order by a.period ROWS UNBOUNDED PRECEDING)
from @dim a
left join @fact x on a.name = x.name and a.period = x.period
) t
I leave you with one tip: APPLY
is very useful, but don't go round hammering every nail with it. Understand how it works, use appropriately.