Search code examples
azurerow-numberu-sql

Azure / U-SQL - ORDER BY in ROW_NUMBER()


I'm building a U-SQL script that has several intermediate tables. I'd like to assign ROW_NUMBER() based on the ordering of one of the columns but can't figure out how to do so.

@t = SELECT *
     FROM(
        VALUES
        ( 1, "2010-01-01","TagName1", 30 ),
        ( 2, "2010-01-01","TagName2", 5 ),
        ( 2, "2010-01-02","TagName2", 7 ),
        ( 2, "2010-01-02","TagName3", 6 ),
        ( 3, "2010-01-03","TagName2", 15 ),
        ( 1, "2010-01-01","TagName2", 2 ),
        ( 1, "2010-01-01","TagName3", 1),
        ( 3, "2010-01-04","TagName1", 2 ),
        ( 3, "2010-01-04","TagName2", 4 )
     ) AS T(DeviceID, Date, TagName, dv);

@t2 = SELECT DISTINCT(dv) as dv
             , ROW_NUMBER() OVER() as LevelNumber
      FROM @t
      ORDER BY dv;

If I run the above code it tells me to move the ORDER BY to the OUTPUT statement but then this doesn't work when the table contents changes in subsequent manipulations.

I tried putting the column name in the OVER() part but this doesn't work either.

I'd like something like this:

  dv          LevelNumber
  1                1
  2                2      
  4                3
  5                4

Any suggestions?


Solution

  • You can include ORDER BY and/or PARTITION BY clauses in the OVER clause. You can move the ORDER BY clause to the OUTPUT statement, eg

    @t = SELECT *
         FROM(
            VALUES
            ( 1, "2010-01-01","TagName1", 30 ),
            ( 2, "2010-01-01","TagName2", 5 ),
            ( 2, "2010-01-02","TagName2", 7 ),
            ( 2, "2010-01-02","TagName3", 6 ),
            ( 3, "2010-01-03","TagName2", 15 ),
            ( 1, "2010-01-01","TagName2", 2 ),
            ( 1, "2010-01-01","TagName3", 1),
            ( 3, "2010-01-04","TagName1", 2 ),
            ( 3, "2010-01-04","TagName2", 4 )
         ) AS T(DeviceID, Date, TagName, dv);
    
    
    //@t2 = SELECT DISTINCT(dv) as dv
    //             , ROW_NUMBER() OVER() as LevelNumber
    //      FROM @t
    //      ORDER BY dv;
    
    
    @output =
        SELECT DeviceID,
               Date,
               TagName,
               dv,
               ROW_NUMBER() OVER( ORDER BY dv ) AS LevelNumber
        FROM @t;
    
    
    OUTPUT @output TO "/output/output.csv"
    ORDER BY dv
    USING Outputters.Csv();
    

    The closest I could get to your result was with this query, but the logic you need is not 100% clear to me:

    @output =
        SELECT dv,
               ROW_NUMBER() OVER(ORDER BY dv) AS LevelNumber
        FROM
        (
            SELECT DISTINCT dv
            FROM @t
        ) AS x;