Search code examples
sql-servert-sqlwindow-functions

Different results for COUNT using as window function with and without ORDER BY


Say, I have this simple table:

declare @t table ([period] date, [contract] int, price int);
insert into @t values
('2022-11-01', 1, 234),
('2022-12-01', 1, 277),
('2023-01-01', 1, 489),
('2022-11-01', 2, 187),
('2022-12-01', 2, 598),
('2022-03-01', 2, 478);

When using it as window function with and without ORDER BY, I get different results. When I used ORDER BY, I was surprised with the results since I expected all values in count column to be 3.

select     *,
       count = count(*) over (partition by [contract] order by [period])
from @t;
period contract price count
2022-11-01 1 234 1
2022-12-01 1 277 2
2023-01-01 1 489 3
2022-11-01 2 478 1
2022-12-01 2 187 2
2023-01-01 2 598 3

I couldn't understand, why this was happening. Then I decided to remove order by [period] clause. And then all worked like a charm!

select     *,
       count = count(*) over (partition by [contract])
from @t;
period contract price count
2022-11-01 1 234 3
2022-12-01 1 277 3
2023-01-01 1 489 3
2022-11-01 2 478 3
2022-12-01 2 187 3
2023-01-01 2 598 3

After that I decided to take a look at the documentation, and it says:

The order_by_clause determines the logical order of the operation. See OVER clause (Transact-SQL) for more information.

And that OVER link says:

ROWS/RANGE that limits the rows within the partition by specifying start and end points within the partition. It requires ORDER BY argument and the default value is from the start of partition to the current element if the ORDER BY argument is specified.

Do I correctly understand that as soon as I specify ORDER BY, it acts like BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW? Then when I omit it, why it acts as BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING?


Solution

  • The documentation is quite clear on this further down the same page.

    ORDER BY

    ORDER BY *order_by_expression* [COLLATE *collation_name*] [ASC|DESC]  
    

    Defines the logical order of the rows within each partition of the result set. That is, it specifies the logical order in which the window function calculation is performed.

    • If it is not specified, the default order is ASC and window function will use all rows in partition.
    • If it is specified, and a ROWS/RANGE is not specified, then default RANGE UNBOUNDED PRECEDING AND CURRENT ROW is used as default for window frame by the functions that can accept optional ROWS/RANGE specification (for example min or max).

    So by default you get a running sum. This default is only if you specify ORDER BY, otherwise the window calculation is over the whole partition (ie not running).

    Note that ROWS and RANGE are subtly different, and annoyingly 99% of the time you want ROWS for a running sum so make sure to specify it.