Search code examples
sqlwindow-functions

Can a sub-select with Row_Num always emulate a window function?


If a row_number is added to every row, is it possible to emulate everything that an analytic function can do with a subselect? For example:

WITH res(id, year, sales) as (
    SELECT 'Apple', 2010, 100 UNION SELECT 'Apple', 2011, 150 UNION SELECT 'Apple', 2012, 120 UNION
    SELECT 'Google', 2010, 301 UNION SELECT 'Google', 2011, 400 UNION SELECT 'Google', 2011, 450
),
res_with_row AS (
    SELECT *, ROW_NUMBER() OVER () AS row_num FROM res
)
SELECT *
    ,LAG(sales) OVER (PARTITION BY id ORDER BY YEAR) sales_last_year_analytic
    ,(SELECT sales FROM res_with_row as _inner 
           WHERE _inner.row_num=res_with_row.row_num-1 -- emulate LAG
           AND _inner.id=res_with_row.id               -- emulate PARITION BY id
        ) sales_last_year_subselect
FROM res_with_row;

enter image description here

Or, are there any things that an analytic function can do that couldn't be emulated with a correlated subquery if the user had access to the rowid of that entry?


Note: I've tagged this with three database backends. I have access to all three and the syntax should be applicable to all. If someone thinks it's better suited to removing those tags, please feel free to do so.


Solution

  • It's possible to emulate window functions - there are a lot of questions in Stack Overflow for SQL Server 2008 R2 and older where issues are solved without them as they are not supported.

    The window functions are about performance and better readability. If you compare the execution plans (SQL Server) for your example:

    WITH res(id, year, sales) as (
        SELECT 'Apple', 2010, 100 UNION SELECT 'Apple', 2011, 150 UNION SELECT 'Apple', 2012, 120 UNION
        SELECT 'Google', 2010, 301 UNION SELECT 'Google', 2011, 400 UNION SELECT 'Google', 2011, 450
    ),
    res_with_row AS (
        SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS row_num FROM res
    )
    SELECT *
        ,LAG(sales) OVER (PARTITION BY id ORDER BY YEAR) sales_last_year_analytic
      
    FROM res_with_row;
    
    
    
    WITH res(id, year, sales) as (
        SELECT 'Apple', 2010, 100 UNION SELECT 'Apple', 2011, 150 UNION SELECT 'Apple', 2012, 120 UNION
        SELECT 'Google', 2010, 301 UNION SELECT 'Google', 2011, 400 UNION SELECT 'Google', 2011, 450
    ),
    res_with_row AS (
        SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS row_num FROM res
    )
    SELECT *
         ,(SELECT sales FROM res_with_row as _inner 
               WHERE _inner.row_num=res_with_row.row_num-1 -- emulate LAG
               AND _inner.id=res_with_row.id               -- emulate PARITION BY id
            ) sales_last_year_subselect
    FROM res_with_row;
    

    See how more complex is the second one (without window function used). Imagine query where you need to calculate several metrics using window functions.

    enter image description here

    Also, in some cases, it's not possible to emulate them in the context of a single query. I often see legacy code where the ordering is done via separate table. For example:

    1. we have records with RowID column displaying how rows are ordered

    2. we delete the 5th and the 8th records.

    3. now, in order to add new sequential RowID, the devs used something like this:

      DECLARE @DataSource TABLE
      (
          [PrimaryKeyColumn] INT
         ,[RowID] INT IDENTITY(1,1)
      );
      
      INSERT INTO @DataSource ([PrimaryKeyColumn])
      SELECT [PrimaryKeyColumn]
      FROM [my_table]
      ORDER BY [RowID]
      
      UPDATE [my_table]
      SET [RowID] = DS.[RowID]
      FROM [my_table] T
      INNER JOIN @DataSource DS
          ON T.[PrimaryKeyColumn] = DS.[PrimaryKeyColumn];
      

    The above does seems strange as we have INSERT and ORDER BY. Generating the RowID with values seems better. This is based on that the ORDER BY is not ignored when there is IDENTITY column.

    Also, ordering the rows in the database, not when data is extracted seems wrong but this is another topic.