Search code examples
sqlsql-server

SQL Server applying ORDER BY .. OFFSET only to UNION subquery


In SQL Server 2022 (and earlier), this behaves as I'd expect, according to the SQL standard:

select i
from (values (1)) as t (i)
union 
select i
from (values (2), (3)) as t (i)
order by i
offset 0 rows
fetch next 1 rows only

It produces:

|i  |
|---|
|1  |

So, like everywhere else, the ORDER BY .. OFFSET clause is applied to the result of the unions. However, when I nest the above query in a derived table or CTE, then the semantics seems to change (assume the actual query is much more complex):

select *
from (
  select i
  from (values (1)) as t (i)
  union 
  select i
  from (values (2), (3)) as t (i)
  order by i
  offset 0 rows
  fetch next 1 rows only
) t;

This produces:

|i  |
|---|
|1  |
|2  |

Now, the ORDER BY .. OFFSET clause is applied to the UNION's second subquery only. Is this specified behaviour? I cannot seem to find authoritative information in the ORDER BY docs, which read:

In a query that uses UNION, EXCEPT, or INTERSECT operators, ORDER BY is allowed only at the end of the statement. This restriction applies only to when you specify UNION, EXCEPT, and INTERSECT in a top-level query and not in a subquery. See the Examples section that follows.

But there's no example showing this particular case. The SELECT docs's syntax diagrams don't really support ORDER BY in UNION subqueries, only in top-level <SELECT statement>, so that documentation seems outdated / incomplete as well.

Is this a well known bug?


Solution

  • Bug or not

    It's unlikely a bug, just an interesting design choice of how SELECT works as a top level query (similar to standard SQL) vs how it works as a subquery. This example shows that all UNION subqueries can have "local" ORDER BY .. OFFSET clauses, and the trailing clause isn't a global one over the UNION output:

    select *
    from (
      select i
      from (values (1), (2)) as t (i)
      order by i
      offset 0 rows
      fetch next 1 rows only
      union 
      select i
      from (values (3), (4)) as t (i)
      order by i
      offset 0 rows
      fetch next 1 rows only
    ) t;
    

    This produces:

    |i  |
    |---|
    |1  |
    |3  |
    

    So, this hints at there simply being missing / inconsistent documentation.

    If the subquery is run as a top-level query, then this error is raised:

    SQL Error [156] [S0001]: Incorrect syntax near the keyword 'union'.

    Workaround

    Just nest the set operation once more in a derived table:

    select *
    from (
      select *
      from (
        select i
        from (values (1)) as t (i)
        union 
        select i
        from (values (2), (3)) as t (i)
      ) t
      order by i
      offset 0 rows
      fetch next 1 rows only
    ) t;
    

    Now, the ORDER BY .. OFFSET clause behaves again, and applies to the complete UNION output, producing the expected result:

    |i  |
    |---|
    |1  |