Search code examples
sqlsql-servert-sqlwindow-functionssql-server-2022

Replacing window function OVER() with WINDOW clause reference yields different results


While preparing an answer to another question here, I coded up a query that contained multiple window functions having the same OVER(...) clause. Results were as expected.

select ...
    sum(sum(s.price)) over (partition by p.productid
                            order by c.date
                            rows between 6 preceding and current row)
        / nullif(
             sum(count(s.price)) over(partition by p.productid
                                      order by c.date
                                      rows between 6 preceding and current row),
             0)
        as avg7DayPrice
...

When I attempted to replace the duplicate OVER(...) clauses with a shared WINDOW clause reference OVER Last7Days, defined later as WINDOW Last7Days AS (...), the results were unexpectedly different. (See WINDOW clause - new in 2022.)

select ...
    sum(sum(s.price)) over last7days
        / nullif(sum(count(s.price)) over last7days, 0)
        as avg7DayPrice
...
window last7days as (partition by p.productid
                     order by c.date
                     rows between 6 preceding and current row)

Full original query:

-- Inline window over() clauses
-- Results are as expected
with calendar as (
    select min(date) as date, max(date) as endDate
    from sales
    union all
    select dateadd(day, 1, date), endDate
    from calendar
    where date < enddate
),
products as (
    select distinct productid
    from sales
)
select
    p.productid,
    c.date as salesDate, 
    avg(price) as avg1DayPrice,
    sum(sum(s.price)) over (partition by p.productid
                            order by c.date
                            rows between 6 preceding and current row)
        / nullif(
             sum(count(s.price)) over(partition by p.productid
                                      order by c.date
                                      rows between 6 preceding and current row),
             0)
        as avg7DayPrice
from calendar c
cross join products p
left join sales s
    on s.date = c.date
    and s.productid = p.productid
group by p.productid, c.date

Full modified query:

-- Reference to common defined window clause
-- Expect same results, but that is not what I get.
with calendar as (
    select min(date) as date, max(date) as endDate
    from sales
    union all
    select dateadd(day, 1, date), endDate
    from calendar
    where date < enddate
),
products as (
    select distinct productid
    from sales
)
select
    p.productid,
    c.date as salesDate, 
    avg(price) as avg1DayPrice,
    sum(sum(s.price)) over last7days
        / nullif(
             sum(count(s.price)) over last7days,
             0)
        as avg7DayPrice
from calendar c
cross join products p
left join sales s
    on s.date = c.date
    and s.productid = p.productid
group by p.productid, c.date
window last7days as (partition by p.productid
                     order by c.date
                     rows between 6 preceding and current row)

Sales table:

productId date price
1 2025-02-01 10.00
1 2025-02-02 20.00
1 2025-02-02 30.00
1 2025-02-03 40.00

Original results:

productid salesDate avg1DayPrice avg7DayPrice
1 2025-02-01 10.000000 10.000000
1 2025-02-02 25.000000 20.000000
1 2025-02-03 40.000000 25.000000

Modified results:

productid salesDate avg1DayPrice avg7DayPrice
1 2025-02-01 10.000000 2.500000
1 2025-02-02 25.000000 15.000000
1 2025-02-03 40.000000 25.000000

See this db<>fiddle.

I do not understand why the results are different. It appears that the sum(count(s.price)) over(...) part of the calculation is 1, 3, 4 for the first query, but 4, 4, 4 in the second. The execution plan is also significantly different. Can anyone explain this?


Solution

  • Can anyone explain this?

    This is certainly a bug (reported here), a simpler repro is

    SELECT grp, 
           val,
           sum(val) over wf as input_to_nullif,
           nullif(sum(val) over wf, 30) as wrong_result,
           IIF(sum(val) over wf = 30, NULL, sum(val) over ()) as faulty_expansion,
           IIF(sum(val) over wf = 30, NULL, sum(val) over wf) as correct_expansion
    FROM (VALUES
    (1, 1),
    (1, 3),
    (2, 10),
    (2, 20),
    (2, 20)
    ) V(grp, val)
    window wf as (partition by grp order by val rows unbounded preceding);
    

    which returns

    grp val input_to_nullif wrong_result faulty_expansion correct_expansion
    1 1 1 54 54 1
    1 3 4 54 54 4
    2 10 10 54 54 10
    2 20 30 NULL NULL NULL
    2 20 50 54 54 50

    The problem is that NULLIF gets expanded out to an expression where the first argument appears twice and the window frame is only correctly preserved for the first reference.

    i.e. instead of getting expanded out per the correct_expansion above it gets expanded out to the faulty_expansion variant.

    Running the below...

    
    SELECT 
           NULLIF(sum(val) over wf,-1) as wrong_result_nullif
    FROM (VALUES
    (1, 1),
    (1, 3)
    ) V(grp, val)
    window wf as (partition by grp order by val)
    option (RECOMPILE, querytraceon 3604,querytraceon 8605);
    

    shows this in part of the output (no window frame after the second stopSum)

                AncOp_PrjList 
    
                    AncOp_PrjEl COL: Expr1006 
    
                        ScaOp_AggFunc stopSum partition[CALC:COL: Union1004  TI(int,ML=4)] order[CALC:COL: Union1005  TI(int,ML=4) ASC] range between unbounded and current
    
                            ScaOp_Identifier COL: Union1005 
    
                    AncOp_PrjEl COL: Expr1007 
    
                        ScaOp_AggFunc stopSum
    
                            ScaOp_Identifier COL: Union1005