Search code examples
sqlsubquery

SQL - referring to a 'FROM' subquery column


Take this code for example, which selects the highest amount of orders every year:

SELECT *
FROM (
   SELECT year(OrderDate) AS y, month(OrderDate) AS m, 
          count(orderID) ccc,
          ROW_NUMBER() OVER (PARTITION BY YEAR(OrderDate) ORDER BY count(orderID) DESC) AS rn
    FROM Orders o
    group by year(OrderDate), month(OrderDate)
    ) oo
WHERE oo.rn = 1

I don't understand how [WHERE oo.rn = 1] works.

Because, logically, the column rn(row number) refers to a column in the table with the alias 'o' (in the subquery) and not the table with the alias 'oo' (in the general query).

In my logical opinion it should have been [WHERE o.rn = 1]


Solution

  • In SQL, the result of a query can be thought of as a table in its own right.

    You can "refer" to the result of a query and treat it as such, and when doing so it's known as a derived table - ie, it's a table derived from the output of another query.

    In your example, SELECT year(OrderDate)... is a query and the select * is using its results. A rule of the syntax is that a derived table must be given an alias by which it can be referred to, the alias in this case is oo, which can be used to refer to any column that's part of the inner queries' results.

    Because there are no other "tables" referenced in the top level, the query is essentially saying select * from table, using the alias is not required and by not using oo. the query parser knows it must refer to the only derived table that exists, so where rn=1 would work just as it would if you were selecting * from an actual table.