Say, I have this simple table:
declare @t table ([period] date, [contract] int, price int);
insert into @t values
('2022-11-01', 1, 234),
('2022-12-01', 1, 277),
('2023-01-01', 1, 489),
('2022-11-01', 2, 187),
('2022-12-01', 2, 598),
('2022-03-01', 2, 478);
When using it as window function with and without ORDER BY
, I get different results.
When I used ORDER BY
, I was surprised with the results since I expected all values in count
column to be 3
.
select *,
count = count(*) over (partition by [contract] order by [period])
from @t;
period | contract | price | count |
---|---|---|---|
2022-11-01 | 1 | 234 | 1 |
2022-12-01 | 1 | 277 | 2 |
2023-01-01 | 1 | 489 | 3 |
2022-11-01 | 2 | 478 | 1 |
2022-12-01 | 2 | 187 | 2 |
2023-01-01 | 2 | 598 | 3 |
I couldn't understand, why this was happening. Then I decided to remove order by [period]
clause. And then all worked like a charm!
select *,
count = count(*) over (partition by [contract])
from @t;
period | contract | price | count |
---|---|---|---|
2022-11-01 | 1 | 234 | 3 |
2022-12-01 | 1 | 277 | 3 |
2023-01-01 | 1 | 489 | 3 |
2022-11-01 | 2 | 478 | 3 |
2022-12-01 | 2 | 187 | 3 |
2023-01-01 | 2 | 598 | 3 |
After that I decided to take a look at the documentation, and it says:
The order_by_clause determines the logical order of the operation. See OVER clause (Transact-SQL) for more information.
And that OVER
link says:
ROWS/RANGE
that limits the rows within the partition by specifying start and end points within the partition. It requires ORDER BY argument and the default value is from the start of partition to the current element if the ORDER BY argument is specified.
Do I correctly understand that as soon as I specify ORDER BY
, it acts like BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
? Then when I omit it, why it acts as BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
?
The documentation is quite clear on this further down the same page.
ORDER BY
ORDER BY *order_by_expression* [COLLATE *collation_name*] [ASC|DESC]
Defines the logical order of the rows within each partition of the result set. That is, it specifies the logical order in which the window function calculation is performed.
- If it is not specified, the default order is ASC and window function will use all rows in partition.
- If it is specified, and a
ROWS
/RANGE
is not specified, then defaultRANGE UNBOUNDED PRECEDING AND CURRENT ROW
is used as default for window frame by the functions that can accept optionalROWS
/RANGE
specification (for example min or max).
So by default you get a running sum. This default is only if you specify ORDER BY
, otherwise the window calculation is over the whole partition (ie not running).
Note that ROWS
and RANGE
are subtly different, and annoyingly 99% of the time you want ROWS
for a running sum so make sure to specify it.